Abstract. A probabilistic method for solving the Monge-Kantorovich mass transport
problem on $\mathbb{R}^d$ is introduced. A system of empirical measures of independent particles
is built in such a way that it obeys a doubly indexed large deviation principle with an
optimal transport cost as its rate function. As a consequence, new approximation results
for the optimal cost function and the optimal transport plans are derived. They follow
from the Γ-convergence of a sequence of normalized relative entropies toward the optimal
transport cost. A wide class of cost functions including the standard power cost functions
$|x - y|^p$ enter this framework.

1. Introduction

This paper introduces a probabilistic method for solving the Monge-Kantorovich mass 
transport problem. 

1.1. The Monge-Kantorovich problem. Let $\mu$ and $\nu$ be two probability measures
on $\mathbb{R}^d$ seen as mass distributions. One wants to transfer $\mu$ to $\nu$ with a minimal cost,
given that transporting a unit mass from $x_0$ to $x_1$ costs $c(x_0, x_1)$. This means that one
searches for a transport plan $x_1 = T(x_0)$ such that the image measure $T \circ \mu$ is $\nu$ and
$\int_{\mathbb{R}^d} c(x_0, T(x_0))\, \mu(dx_0)$ is minimal. This problem was addressed by G. Monge [17] in the
eighteenth century. In the 1940s, L.V. Kantorovich [12], [13] proposed a relaxed version of
Monge's problem by allowing each cell of mass at $x_0$ to crumble into powder so that it can
be transferred to several $x_1$'s. In mathematical terms, one searches for a probability measure
$\rho$ on $\mathbb{R}^d \times \mathbb{R}^d$ whose marginal measures $\rho_0(dx_0) = \rho(dx_0 \times \mathbb{R}^d)$ and $\rho_1(dx_1) = \rho(\mathbb{R}^d \times dx_1)$
satisfy $\rho_0 = \mu$ and $\rho_1 = \nu$ and such that $\int_{\mathbb{R}^d \times \mathbb{R}^d} c(x_0, x_1)\, \rho(dx_0 dx_1)$ is minimal. Let us
denote by $\mathcal{P}_{\mathbb{R}^d}$ and $\mathcal{P}_{\mathbb{R}^{2d}}$ the sets of all probability measures on $\mathbb{R}^d$ and $\mathbb{R}^d \times \mathbb{R}^d$. For each $\mu$
and $\nu$ in $\mathcal{P}_{\mathbb{R}^d}$, we face the optimization problem

(MK)    minimize $\int_{\mathbb{R}^{2d}} c(x_0, x_1)\, \rho(dx_0 dx_1)$ subject to $\rho \in \Pi(\mu, \nu)$

where the cost function $c : \mathbb{R}^d \times \mathbb{R}^d \to [0, +\infty]$ is assumed to be measurable and

    $\Pi(\mu, \nu) = \{\rho \in \mathcal{P}_{\mathbb{R}^{2d}};\ \rho_0 = \mu,\ \rho_1 = \nu\}$

is the set of all probability measures on $\mathbb{R}^d \times \mathbb{R}^d$ with marginals $\mu$ and $\nu$. This problem is
called the Monge-Kantorovich mass transport problem. Monge's problem corresponds to
the transport plans $\rho(dx_0 dx_1) = \mu(dx_0)\, \delta_{T(x_0)}(dx_1)$ where $\delta$ stands for the Dirac measure.
Kantorovich's relaxation procedure embeds Monge's nonlinear problem in the linear
programming problem (MK).
The value of (MK) is the transportation cost defined for all $\mu$ and $\nu$ in $\mathcal{P}_{\mathbb{R}^d}$ by

(1.1)    $T_c(\mu, \nu) := \inf_{\rho \in \Pi(\mu, \nu)} \int_{\mathbb{R}^{2d}} c(x_0, x_1)\, \rho(dx_0 dx_1).$

The special cost function $c_p(x_0, x_1) = |x_1 - x_0|^p$ with $p \ge 1$ leads to the Wasserstein metric.
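
As a small illustration of (MK) and (1.1), consider the case where $\mu$ and $\nu$ are uniform measures on the same number $n$ of atoms. By the Birkhoff-von Neumann theorem, the linear program (MK) then attains its minimum at a plan carried by a permutation, so the value (1.1) can be found by brute force. The following Python sketch does this for the quadratic cost on the real line. The function names `quadratic_cost` and `transport_cost` and the sample atoms are illustrative assumptions, not objects taken from the paper.

```python
from itertools import permutations

def quadratic_cost(x0, x1):
    """Quadratic cost c(x0, x1) = |x1 - x0|^2 / 2 on the real line."""
    return (x1 - x0) ** 2 / 2

def transport_cost(xs, ys, cost):
    """Value of (MK) between the uniform measures on the atoms xs and ys.

    For uniform measures with the same number n of atoms, the minimum of the
    linear program (MK) is attained at a permutation, so we enumerate all n!
    permutations (only sensible for very small n).
    """
    n = len(xs)
    assert n == len(ys)
    return min(
        sum(cost(xs[i], ys[sigma[i]]) for i in range(n)) / n
        for sigma in permutations(range(n))
    )

xs = [0.0, 1.0, 3.0]   # atoms of mu (hypothetical data)
ys = [2.0, 0.5, 4.0]   # atoms of nu (hypothetical data)

brute_force = transport_cost(xs, ys, quadratic_cost)

# In dimension one with a convex cost, the monotone (sorted) coupling is optimal.
monotone = sum(quadratic_cost(a, b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)

print(brute_force, monotone)
```

Both numbers agree, which reflects the classical fact that the monotone rearrangement is optimal for convex costs in dimension one.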

1.2. Which large deviations? As the title of the paper indicates, our probabilistic
approach to the Monge-Kantorovich problem is in terms of large deviations. One can interpret
$\mu$ and $\nu$ respectively as the distributions of the initial and final random positions $X_0$ and
$X_1$ of a random process $(X_t)_{0 \le t \le 1}$. In the present paper, only the couple of initial and
final positions $(X_0, X_1)$ is considered.

Our aim is to obtain a Large Deviation Principle (LDP) in $\mathcal{P}_{\mathbb{R}^d}$ the rate function of
which is $\nu \mapsto T_c(\mu, \nu)$ where $\mu$ is fixed. The definition of an LDP is recalled at (1.6).
General cost functions will be considered in the article but for the sake of clarity, in this
introductory section our procedure is described in the special case of the quadratic cost
function $c(x_0, x_1) = |x_1 - x_0|^2/2$. For each integer $k \ge 1$, take a system of $n$ independent
random couples $(X^k_{n,i}(0), X^k_{n,i}(1))_{1 \le i \le n}$ which is described as follows. For each $i$, the initial
position $X^k_{n,i}(0) = z_{n,i}$ is deterministic and the final position is

    $X^k_{n,i}(1) = z_{n,i} + Y_i/\sqrt{k}$

where the $Y_i$'s are independent copies of a standard normal vector in $\mathbb{R}^d$. Consider the
initial mass distribution $\mu$ as fixed and deterministic and choose the initial positions $z_{n,i}$
in such a way that

    $\lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^n \delta_{z_{n,i}} = \mu.$

The empirical measure of the final positions is

    $N^k_n = \frac{1}{n} \sum_{i=1}^n \delta_{X^k_{n,i}(1)}.$

It is a random element of $\mathcal{P}_{\mathbb{R}^d}$. An easy variation of Sanov's theorem states that for each
$k$ the system $\{N^k_n\}_{n \ge 1}$ obeys the LDP in $\mathcal{P}_{\mathbb{R}^d}$ with speed $n$ and the rate function

(1.2)    $\nu \in \mathcal{P}_{\mathbb{R}^d} \mapsto \inf_{\rho \in \Pi(\mu, \nu)} H(\rho|\pi^k) \in [0, \infty].$

Here, $H(\rho|\pi^k)$ is the relative entropy (see (2.15) for its definition) of $\rho$ with respect to
$\pi^k$, and $\pi^k \in \mathcal{P}_{\mathbb{R}^{2d}}$ is the law of $(Z, Z + Y/\sqrt{k})$ where $Z$ and $Y$ are independent, the law of $Z$
is $\mu$ and $Y$ is a standard normal vector. On the other hand, $\{Y/\sqrt{k}\}_{k \ge 1}$ obeys the LDP
in $\mathbb{R}^d$ as $k$ tends to infinity with speed $k$ and rate function $c(u) = |u|^2/2$.
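
The LDP for $\{Y/\sqrt{k}\}_{k \ge 1}$ can be checked numerically in dimension one, where the Gaussian tail is available in closed form through the complementary error function. The sketch below evaluates $-\frac{1}{k} \log P(Y/\sqrt{k} \ge a)$ for the half-line $[a, \infty)$ with $a > 0$, on which the infimum of the rate function is $c(a) = a^2/2$. The helper name `empirical_rate` and the chosen values of $a$ and $k$ are assumptions made for illustration only.

```python
import math

def empirical_rate(a, k):
    """Return -(1/k) * log P(Y / sqrt(k) >= a) for a standard normal Y (d = 1)."""
    # P(Y >= x) = erfc(x / sqrt(2)) / 2, evaluated at x = a * sqrt(k).
    tail = 0.5 * math.erfc(a * math.sqrt(k) / math.sqrt(2))
    return -math.log(tail) / k

a = 1.0
rates = {k: empirical_rate(a, k) for k in (10, 100, 1000)}
for k, r in rates.items():
    print(k, r)   # the values decrease toward c(a) = a**2 / 2 = 0.5
```

The correction to $a^2/2$ is of order $\log(k)/k$, which is invisible at the exponential scale of the LDP.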
Since

(i) the speed of the LDP for $\{Y/\sqrt{k}\}_{k \ge 1}$ is $k$ and

(ii) the rate functions (1.2) and $c(u) = |u|^2/2$ are reminiscent of $T_c$ given at (1.1),

it wouldn't be surprising that

(i) the order of magnitude of $H(\rho|\pi^k)$ is $k$ and

(ii) one should mix together two types of LDPs with $n$ and $k$ tending to infinity, in
order to obtain some LDP with the rate function $\nu \mapsto T_c(\mu, \nu)$.

Indeed, denoting for each $\nu \in \mathcal{P}_{\mathbb{R}^d}$ with fixed $\mu$

    $T_k(\nu) = \inf_{\rho \in \Pi(\mu, \nu)} H(\rho|\pi^k)/k$  and  $T(\nu) = T_c(\mu, \nu)$,

it will be proved that the following Γ-convergence result

(1.3)    $\Gamma\text{-}\lim_{k \to \infty} T_k = T$

holds. As a consequence of this convergence result, for each $\nu \in \mathcal{P}_{\mathbb{R}^d}$ there exists a
sequence $(\nu_k)_{k \ge 1}$ such that

(1.4)    $\lim_{k \to \infty} \nu_k = \nu$  and  $\lim_{k \to \infty} \inf_{\rho \in \Pi(\mu, \nu_k)} H(\rho|\pi^k)/k = T_c(\mu, \nu).$

Theorem 2.9 is the main result of the paper. It states that $\{N^k_n\}_{k,n \ge 1}$ obeys the doubly
indexed LDP as $n$ first tends to infinity, then $k$ tends to infinity with speed $kn$ and rate
function $T$, see Definition 2.5 for the notion of doubly indexed LDP.
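
The convergence of the rescaled entropies toward the transport cost can be observed on a finite toy analogue. In the sketch below, which is only an illustration under stated assumptions and not the Gaussian setting of this section, $\mu$ and $\nu$ live on the two-point grid $\{0, 1\}$ and the reference plan is taken as $\pi^k(i, j) = \mu_i e^{-k c_{ij}}/Z_i$ with $c_{ij} = |x_j - x_i|^2/2$, so that each row obeys a $k$-LDP with rate $c_{ij}$. The plans in $\Pi(\mu, \nu)$ then form a segment parametrized by $t = \rho(0, 0)$, and $T_k(\nu)$ is obtained by a one-dimensional convex minimization. All names (`GRID`, `coupling`, `rescaled_entropy`, `T_k`) are hypothetical.

```python
import math

GRID = [0.0, 1.0]                      # common support of mu and nu (toy assumption)
MU = [0.7, 0.3]                        # initial distribution mu
NU = [0.4, 0.6]                        # final distribution nu

def cost(i, j):
    """Quadratic cost c(x0, x1) = |x1 - x0|^2 / 2 between grid points."""
    return (GRID[j] - GRID[i]) ** 2 / 2

def coupling(t):
    """The plan in Pi(mu, nu) with rho[0][0] = t (2x2 plans form a segment)."""
    return [[t, MU[0] - t], [NU[0] - t, 1.0 - MU[0] - NU[0] + t]]

def rescaled_entropy(rho, k):
    """H(rho | pi^k) / k with pi^k(i, j) = mu_i * exp(-k c_ij) / Z_i."""
    total = 0.0
    for i in range(2):
        log_z = math.log(sum(math.exp(-k * cost(i, j)) for j in range(2)))
        for j in range(2):
            if rho[i][j] > 0.0:        # convention 0 * log 0 = 0
                log_pi = math.log(MU[i]) - k * cost(i, j) - log_z
                total += rho[i][j] * (math.log(rho[i][j]) - log_pi)
    return total / k

def T_k(k, iterations=200):
    """Minimize the convex map t -> H(rho_t | pi^k) / k by ternary search."""
    lo, hi = max(0.0, MU[0] + NU[0] - 1.0), min(MU[0], NU[0])
    for _ in range(iterations):
        m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
        if rescaled_entropy(coupling(m1), k) < rescaled_entropy(coupling(m2), k):
            hi = m2
        else:
            lo = m1
    return rescaled_entropy(coupling((lo + hi) / 2), k)

# Optimal transport cost: the objective is linear in t, so check both endpoints.
lo, hi = max(0.0, MU[0] + NU[0] - 1.0), min(MU[0], NU[0])
T = min(sum(coupling(t)[i][j] * cost(i, j) for i in range(2) for j in range(2))
        for t in (lo, hi))

values = {k: T_k(k) for k in (1, 10, 100)}
print(T, values)
```

In this finite setting one has $|T_k(\nu) - T_c(\mu, \nu)| = O(1/k)$, because the entropy term differs from the cost term by at most a conditional entropy bounded by $\log 2$.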

1.3. An approximation procedure. The Γ-limit (1.3) suggests that the sequence of
minimizers $\rho^*_k$ of $H(\rho|\pi^k)$ subject to the constraint $\rho \in \Pi(\mu, \nu)$ should converge as $k$ tends
to infinity to some minimizer of $\rho \mapsto \int_{\mathbb{R}^{2d}} c\, d\rho$ subject to the same constraint $\rho \in \Pi(\mu, \nu)$.
This fails in many situations. Consider for instance a purely atomic initial measure $\mu$ and
a family of atomic probability measures $\pi^k$. Although $T(\nu)$ may be finite for some diffuse
final measure $\nu$, there are no $\rho$ in $\Pi(\mu, \nu)$ which are absolutely continuous with respect to
$\pi^k$ since $\pi^k_1$ is atomic. Hence, $T_k(\nu) = +\infty$ for all $k$, and there are no minimizers $\rho^*_k$ at
all. To take this phenomenon into account, one can think of the minimization problems

(MK_k)    minimize $H(\rho|\pi^k)/k$ subject to $\rho \in \Pi(\mu, \nu_k)$

where $(\nu_k)_{k \ge 1}$ satisfies (1.4). I didn't succeed in proving that $\lim_{k \to \infty}$ (MK_k) = (MK) in
the sense of Γ-convergence.

Alternatively, one can relax the constraint $\rho_1 = \nu$ by means of a continuous penalization
sequence and consider the three minimization problems

(MK^α_k)    minimize $H(\rho|\pi^k)/k + \alpha\, d(\rho_1, \nu)$ subject to $\rho_0 = \mu$

(MK^α)      minimize $\int_{\mathbb{R}^{2d}} c\, d\rho + \alpha\, d(\rho_1, \nu)$ subject to $\rho_0 = \mu$

(MK)        minimize $\int_{\mathbb{R}^{2d}} c\, d\rho$ subject to $\rho \in \Pi(\mu, \nu)$

where $k, \alpha \ge 1$ are intended to tend to infinity and $d(\rho_1, \nu)$ is some distance between $\rho_1$
and $\nu$ which is compatible with the narrow topology of $\mathcal{P}_{\mathbb{R}^d}$.

Note that (MK^α_k) is a strictly convex problem while (MK^α) and (MK) are not. As a
consequence (MK^α_k) admits a unique minimizer $\rho^\alpha_k$ while (MK^α) and (MK) may admit
several ones. It will be proved by means of another Γ-convergence result that

(1.5)    $\lim_{k \to \infty}$ (MK^α_k) = (MK^α)  and  $\lim_{\alpha \to \infty}$ (MK^α) = (MK).

These formulas are to be understood at a formal level. It means in particular that
for each $\alpha$, $\lim_{k \to \infty} \inf$ (MK^α_k) $= \inf$ (MK^α) and all the limit points of the relatively
compact sequence $(\rho^\alpha_k)_{k \ge 1}$ are minimizers of the limiting problem (MK^α). Similarly,
$\lim_{\alpha \to \infty} \inf$ (MK^α) $= \inf$ (MK) and denoting by $\rho^\alpha$ a minimizer of (MK^α), any limit point
of the relatively compact sequence $(\rho^\alpha)_{\alpha \ge 1}$ is a minimizer of the limiting problem (MK).

1.4. Some comments about the results of this paper. The doubly indexed LDP for
$\{N^k_n\}_{k,n \ge 1}$, the limit (1.3) and the approximation procedure (1.5) are new results. Large
deviations have only been used as a guideline to obtain the analytical results (1.3) and

In the rest of the paper not only the quadratic cost is considered but a much wider
class of cost functions. In particular, the above mentioned results hold true for the usual
power cost functions $c(u) = |u|^p$ with $p > 0$. Note that the convexity of $c$ is not required.

We chose $\mathbb{R}^d$ as the surrounding space to make the presentation of the results easier.
It is by no means a limitation. Our main large deviation result (Theorem 5.1) is stated with
Polish spaces. On the other hand, the proofs of our convergence results mainly rely on
Γ-convergence. We have done them in $\mathbb{R}^d$, but their extension to a Polish space is obvious.

As a by-product of our approach, the Kantorovich duality ([21], Theorem 1.3) is
recovered, see Theorems 5.1 and 6.2. This provides a new proof of it, although not the shortest
one.

1.5. Literature. Since Brenier's note [5] in 1987, which was motivated by fluid mechanics,
optimal transport has been a very active area of applied mathematics. For a comprehensive
account of optimal transport theory, we refer to the monographs of Rachev and
Rüschendorf [12] and Villani [21]. Villani's recent Saint-Flour lecture notes [23] are
up-to-date and aimed at a probabilistic reader. They introduce newly born techniques and
offer a very long reference list.

Although optimal transport has important consequences in probability theory
(Wasserstein's metrics or transportation inequalities for instance), it has seldom been studied
from a probabilistic point of view. Let us cite among others the contributions of Feyel
and Üstünel [10], [11] about the Monge-Kantorovich problem on Wiener space. Recently,
Mikami [16] has obtained a probabilistic proof of the existence of a solution to Monge's
problem with a quadratic cost by means of an approximation procedure by h-processes.
His approach is based on optimal control techniques.

Doubly indexed LDs of empirical measures have been studied by Boucher, Ellis and
Turkington in [3]. In [H], the tight connection between doubly indexed LDs and the
Γ-convergence of LD rate functions is stressed. This will be used in the present article.

1.6. Γ-convergence. The Γ-convergence is a useful tool which is going to be used
repeatedly. We refer to the monograph of G. Dal Maso [15] for a clear exposition of the
subject. Precise references to the invoked theorems in [15] will be given throughout the
paper.

Recall that if it exists, the Γ-limit of the sequence $(f_n)_{n \ge 1}$ of $(-\infty, \infty]$-valued functions
on a topological space $\mathcal{X}$ is given for all $x$ in $\mathcal{X}$ by

    $\Gamma\text{-}\lim_{n \to \infty} f_n(x) = \sup_{V \in \mathcal{N}(x)} \lim_{n \to \infty} \inf_{y \in V} f_n(y)$

where $\mathcal{N}(x)$ is the set of all neighbourhoods of $x$. This notion of convergence is
well-designed for minimization problems. Denoting $f = \Gamma\text{-}\lim_{n \to \infty} f_n$ and taking $(x_n)$ a
converging sequence of minimizers of $(f_n)$ with $\lim_{n \to \infty} x_n = x^*$, if $(f_n)_{n \ge 1}$ is equi-coercive
we have $\lim_{n \to \infty} \inf f_n = \inf f$ and $x^*$ is a minimizer of $f$.
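
To see why the Γ-limit, rather than the pointwise limit, is the right notion for minimization, one can look at the textbook sequence $f_n(x) = n x e^{-2 n^2 x^2}$ on $[-1, 1]$. It converges pointwise to $0$, but its Γ-limit takes the value $-e^{-1/2}/2$ at $x = 0$ and $0$ elsewhere; the minimizers $x_n = -1/(2n)$ converge to $0$, the minimizer of the Γ-limit. The Python sketch below checks this on a grid. The example and the helper name `f` are illustrative assumptions and are not taken from the paper.

```python
import math

def f(n, x):
    """f_n(x) = n x exp(-2 n^2 x^2); pointwise, f_n(x) -> 0 for every x."""
    return n * x * math.exp(-2.0 * (n * x) ** 2)

# Grid on [-1, 1] with step 1e-4, fine enough to resolve the scale 1/n for n <= 50.
grid = [i / 10000.0 for i in range(-10000, 10001)]

results = {}
for n in (1, 5, 50):
    x_n = min(grid, key=lambda x: f(n, x))     # a minimizer of f_n on the grid
    results[n] = (x_n, f(n, x_n))
    print(n, x_n, f(n, x_n))

# Gamma-limit: f(0) = -exp(-1/2)/2 and f(x) = 0 for x != 0.
gamma_limit_min = -0.5 * math.exp(-0.5)
```

The minimal values stay equal to the minimum of the Γ-limit, whereas the pointwise limit would wrongly suggest a minimal value of $0$.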

1.7. Some notations and conventions. Let us fix some notations and conventions.
Topological conventions. The space of all continuous bounded functions on a topological
space $\mathcal{X}$ is denoted by $C_{\mathcal{X}}$ and is equipped with the uniform norm $\|f\| = \sup_{x \in \mathcal{X}} |f(x)|$, $f \in C_{\mathcal{X}}$.
Unless specified, its dual space $C'_{\mathcal{X}}$ is equipped with the *-weak topology $\sigma(C'_{\mathcal{X}}, C_{\mathcal{X}})$.
Any Polish space $\mathcal{X}$ is equipped with its Borel σ-field and the set $\mathcal{P}_{\mathcal{X}}$ of all the probability
measures on $\mathcal{X}$ is equipped with the narrow topology $\sigma(\mathcal{P}_{\mathcal{X}}, C_{\mathcal{X}})$: the relative topology of
$C'_{\mathcal{X}}$ on $\mathcal{P}_{\mathcal{X}}$. While considering random probability measures, it is necessary to equip $\mathcal{P}_{\mathcal{X}}$
with some σ-field: we take its Borel σ-field.

Large deviations. Let $\{V_n\}_{n \ge 1}$ be a sequence of random variables taking their values in some
topological space $\mathcal{V}$ equipped with some σ-field. One says that $\{V_n\}_{n \ge 1}$ obeys the Large
Deviation Principle (LDP) in $\mathcal{V}$ with speed $n$ and rate function $I : \mathcal{V} \to [0, \infty]$, if $I$ is
lower semicontinuous and for every measurable subset $A$ of $\mathcal{V}$, we have

(1.6)    $-\inf_{v \in \mathrm{int}\, A} I(v) \le \liminf_{n \to \infty} \frac{1}{n} \log P(V_n \in A)$
             $\le \limsup_{n \to \infty} \frac{1}{n} \log P(V_n \in A) \le -\inf_{v \in \mathrm{cl}\, A} I(v)$

where int $A$ and cl $A$ are the interior and the closure of $A$ in $\mathcal{V}$.

To emphasize the parameter $n$, one says that this is an $n$-LDP. If $p_n$ denotes the law of $V_n$,
one also writes that $\{p_n\}_{n \ge 1}$ obeys the $n$-LDP in $\mathcal{V}$ with the rate function $I$.
The rate function $I$ is said to be a good rate function if for each $a \ge 0$, the level set
$\{I \le a\}$ is a compact subset of $\mathcal{V}$. We shall equivalently write that $I$ is inf-compact.

1.8. Organization of the paper. At Section 2 the main results are stated precisely
without proof. Their proofs are postponed to Section 6. They rely on preliminary results
obtained at Sections 4 and 5 where general large deviation results are derived for doubly
indexed sequences of random probability measures with our optimal transport problems
in mind. As a preliminary approach, Section 3 is dedicated to easier analogous large
deviation results in terms of simply indexed sequences. Finally, Section 7 is an appendix
dedicated to the proof of a result about the Γ-convergence of convex functions which is
used in Section 4. Since we didn't find this result in the literature, we give its detailed
proof.

2. Statement of the results

The main result of the paper is Theorem 5.1; it is stated in an abstract setting with
general Polish spaces. In the present section, it is restated at Theorem 2.9 without proof
in the particular framework of the optimal transport on $\mathbb{R}^d$. All the results of the present
section are proved at Section 6 using the results of Sections 3, 4 and 5.

2.1. Some transportation cost functions are LD rate functions. Take a triangular
array $(z_{n,i} \in \mathbb{R}^d;\ 1 \le i \le n,\ n \ge 1)$ in $\mathbb{R}^d$ which satisfies

(2.1)    $\lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^n \delta_{z_{n,i}} = \mu$

for some $\mu \in \mathcal{P}_{\mathbb{R}^d}$.

For each $z \in \mathbb{R}^d$, let $\{U^k_z\}_{k \ge 1}$ be a sequence of $\mathbb{R}^d$-valued random variables. For each
$k \ge 1$ and $n \ge 1$, take $n$ independent random variables $(X^k_{n,i}(1))_{1 \le i \le n}$ where

(2.2)    $X^k_{n,i}(1) \overset{\mathrm{Law}}{=} U^k_{z_{n,i}}.$

For each $k$, $(X^k_{n,i}(1);\ 1 \le i \le n)_{n \ge 1}$ is a triangular array of independent particles which,
in the general case, are not identically distributed because of the contribution of the
deterministic $z_{n,i}$'s. An important example is given by $U^k_z = z + U^k$ with $\{U^k\}_{k \ge 1}$ a
sequence of $\mathbb{R}^d$-valued random variables. This gives for each $k, n \ge 1$

(2.3)    $X^k_{n,i}(1) = z_{n,i} + U^k_i$

where $(U^k_i)_{1 \le i \le n}$ are independent copies of $U^k$.

We are interested in the large deviations of the empirical measures on $\mathbb{R}^d$

(2.4)    $N^k_n = \frac{1}{n} \sum_{i=1}^n \delta_{X^k_{n,i}(1)}$

as $n$ first tends to infinity, then $k$ tends to infinity. More precisely, doubly indexed LDPs
in the sense of the following definition will be proved.
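
For readers who want to experiment, the particle system (2.3) and the empirical measure (2.4) are easy to simulate. The sketch below takes $d = 1$, $U^k = Y/\sqrt{k}$ with $Y$ standard normal, and deterministic initial positions $z_{n,i} = (i - 1/2)/n$, whose empirical measures converge to the uniform law on $(0, 1)$ so that (2.1) holds. These choices, the seed and the function names are assumptions made for the example only.

```python
import math
import random

def final_positions(z, k, rng):
    """Sample X(1) = z_i + Y_i / sqrt(k) with Y_i i.i.d. standard normal (d = 1)."""
    return [z_i + rng.gauss(0.0, 1.0) / math.sqrt(k) for z_i in z]

def empirical_mean(points, g):
    """Integral of g against the empirical measure (1/n) * sum of Dirac masses."""
    return sum(g(x) for x in points) / len(points)

rng = random.Random(0)                      # fixed seed, for reproducibility
n, k = 20000, 4
z = [(i + 0.5) / n for i in range(n)]       # z_{n,i}: empirical measure -> Uniform(0, 1)
x = final_positions(z, k, rng)

mean_N = empirical_mean(x, lambda u: u)                 # close to 1/2
second_moment_N = empirical_mean(x, lambda u: u * u)    # close to 1/3 + 1/k
print(mean_N, second_moment_N)
```

For fixed $k$ and large $n$, the empirical measure concentrates around the second marginal of the law of $(Z, Z + Y/\sqrt{k})$; the large deviation results describe the exponentially small probabilities of departing from it.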

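Definition 2.5 (Doubly indexed LDP). Let $\mathcal{P}_{\mathcal{X}}$ be the set of all probability measures
built on the Borel σ-field of a Polish space $\mathcal{X}$. The set $\mathcal{P}_{\mathcal{X}}$ is equipped with the topology
of narrow convergence and with the corresponding Borel σ-field.

One says that a doubly indexed $\mathcal{P}_{\mathcal{X}}$-valued sequence $\{L^k_n\}_{k,n \ge 1}$ obeys the $(k,n)$-LDP in
$\mathcal{P}_{\mathcal{X}}$ with the rate function $I : \mathcal{P}_{\mathcal{X}} \to [0, \infty]$, if for every measurable subset $B$ of $\mathcal{P}_{\mathcal{X}}$, we have

(2.6)    $-\inf_{Q \in \mathrm{int}\, B} I(Q) \le \liminf_{k \to \infty} \liminf_{n \to \infty} \frac{1}{kn} \log P(L^k_n \in B)$
             $\le \limsup_{k \to \infty} \limsup_{n \to \infty} \frac{1}{kn} \log P(L^k_n \in B) \le -\inf_{Q \in \mathrm{cl}\, B} I(Q)$

where int $B$ and cl $B$ are the interior and closure of $B$ in $\mathcal{P}_{\mathcal{X}}$.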

Assumptions 2.7. This set of assumptions holds for the present section and Section 6.

• (2.1) holds for some $\mu$ in $\mathcal{P}_{\mathbb{R}^d}$,

• for each $k \ge 1$, $(\mathrm{Law}(U^k_z);\ z \in \mathbb{R}^d)$ is a Feller system in the sense of Definition 2.8
below and

• for each $z \in \mathbb{R}^d$, $\{U^k_z\}_{k \ge 1}$ obeys the $k$-LDP in $\mathbb{R}^d$ with the good rate function
$c_z(u) \in [0, \infty]$, $u \in \mathbb{R}^d$.

Definition 2.8. Let $\mathcal{Z}$ and $\mathcal{X}$ be two topological spaces. The system of Borel probability
measures $(P_z;\ z \in \mathcal{Z})$ on $\mathcal{X}$ is a Feller system if for all $f$ in $C_{\mathcal{X}}$, $z \in \mathcal{Z} \mapsto \int_{\mathcal{X}} f(x)\, P_z(dx) \in \mathbb{R}$
is a continuous function on $\mathcal{Z}$.

The next theorem shows that the large deviations of $\{N^k_n\}$ are closely related to optimal
transport.

Theorem 2.9. The doubly indexed system $\{N^k_n\}_{k,n \ge 1}$ obeys the $(k,n)$-LDP in $\mathcal{P}_{\mathbb{R}^d}$ with
the rate function

    $T(\nu) = T_c(\mu, \nu)$

for all $\nu \in \mathcal{P}_{\mathbb{R}^d}$, where the cost function is given by

(2.10)    $c(x_0, x_1) = c_{x_0}(x_1), \quad x_0, x_1 \in \mathbb{R}^d.$

In the special case where (2.3) holds and $\{U^k\}_{k \ge 1}$ obeys the $k$-LDP in $\mathbb{R}^d$ with the good
rate function $c : \mathbb{R}^d \to [0, \infty]$, we have $c(x_0, x_1) = c(x_1 - x_0)$, $x_0, x_1 \in \mathbb{R}^d$.

Examples 2.11. In the special case where (2.3) holds, we give some examples of $\{U^k\}$ and
the corresponding cost function $c$.

(1) With $U^k = Y/\sqrt{k}$ where $Y$ is a standard normal random vector on $\mathbb{R}^d$, we get

    $c(u) = |u|^2/2, \quad u \in \mathbb{R}^d.$

This is the usual quadratic cost function.

(2) Let $(Y_m)_{m \ge 1}$ be a sequence of independent copies of an $\mathbb{R}^d$-valued random vector
$Y$ which satisfies $E e^{a|Y|} < \infty$ for some $a > 0$. With $U^k = \frac{1}{k} \sum_{1 \le m \le k} Y_m$, Cramér's
theorem ([8], Corollary 6.1.6) states that $\{U^k\}$ obeys the $k$-LDP in $\mathbb{R}^d$ with the
rate function $c = c^Y$:

(2.12)    $c^Y(u) = \sup_{\zeta \in \mathbb{R}^d} \{\langle \zeta, u \rangle - \log E e^{\langle \zeta, Y \rangle}\}, \quad u \in \mathbb{R}^d.$

Observe that (1) is a specific instance of (2).

(3) Let $(Y_m)_{m \ge 1}$ be as above and let $\alpha$ be any continuous mapping on $\mathbb{R}^d$. With
$U^k = \alpha\big(\frac{1}{k} \sum_{1 \le m \le k} Y_m\big)$ we obtain $c(u) = \inf\{c^Y(v);\ v \in \mathbb{R}^d,\ \alpha(v) = u\}$, $u \in \mathbb{R}^d$,
as a consequence of the contraction principle. In particular if $\alpha$ is a continuous
injective mapping, then

    $c = c^Y \circ \alpha^{-1}.$

(4) For instance, mixing (1) and (3) with $\alpha = \alpha_p$ given for each $p > 0$ and $v \in \mathbb{R}^d$
by $\alpha_p(v) = 2^{-1/p} |v|^{2/p - 1} v$, taking $U^k = (2k)^{-1/p} |Y|^{2/p - 1} Y$ where $Y$ is a standard
normal random vector on $\mathbb{R}^d$, we get

    $c(u) = |u|^p, \quad u \in \mathbb{R}^d.$

Note that $U^k = k^{-1/p} Y_p$ where the density of the law of $Y_p$ is $C |z|^{p/2 - 1} e^{-|z|^p}$.
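
The computation behind item (4) of Examples 2.11 is short enough to spell out. The following worked derivation is a sketch under the assumptions of items (1) and (3): $Y$ standard normal, $c^Y(v) = |v|^2/2$ and $p > 0$.

```latex
% Assumptions: Y standard normal on R^d, c^Y(v) = |v|^2/2, p > 0.
\alpha_p(v) = 2^{-1/p}\,|v|^{2/p-1}\,v
\quad\Longrightarrow\quad
|\alpha_p(v)| = 2^{-1/p}\,|v|^{2/p},
\qquad
|\alpha_p(v)|^{p} = \tfrac12\,|v|^{2} = c^{Y}(v).
% alpha_p is a continuous bijection of R^d (radial and increasing in |v|), hence
c(u) = c^{Y}\circ\alpha_p^{-1}(u) = \tfrac12\,\bigl|\alpha_p^{-1}(u)\bigr|^{2} = |u|^{p},
\qquad u\in\mathbb{R}^d .
% Consistency with the stated U^k:
\alpha_p\!\left(Y/\sqrt{k}\right)
 = 2^{-1/p}\,k^{-\frac12\left(\frac2p-1\right)}\,|Y|^{2/p-1}\,k^{-1/2}\,Y
 = (2k)^{-1/p}\,|Y|^{2/p-1}\,Y .
```

In particular the power cost is obtained for every $p > 0$, including the non-convex range $0 < p < 1$.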

Examples 2.13. We recall some well-known examples of Cramér transforms $c^Y$.

(1) To obtain the quadratic cost function $c^Y(u) = |u|^2/2$, choose $Y$ as a standard
normal random vector in $\mathbb{R}^d$.

(2) Taking $Y$ such that $P(Y = +1) = P(Y = -1) = 1/2$ leads to

    $c^Y(u) = [(1+u)\log(1+u) + (1-u)\log(1-u)]/2$,  if $-1 < u < +1$,
    $c^Y(u) = \log 2$,  if $u \in \{-1, +1\}$,
    $c^Y(u) = +\infty$,  if $u \notin [-1, +1]$.

(3) If $Y$ has an exponential law with expectation 1, $c^Y(u) = u - 1 - \log u$ if $u > 0$ and
$c^Y(u) = +\infty$ if $u \le 0$.

(4) If $Y$ has a Poisson law with expectation 1, $c^Y(u) = u \log u - u + 1$ if $u > 0$,
$c^Y(0) = 1$ and $c^Y(u) = +\infty$ if $u < 0$.

(5) We have $c^Y(0) = 0$ if and only if $EY = 0$.

(6) More generally, $c^Y(u) \in [0, +\infty]$ and $c^Y(u) = 0$ if and only if $u = EY$.

(7) We have $c^{aY + b}(u) = c^Y((u - b)/a)$ for all real $a \ne 0$ and $b \in \mathbb{R}^d$.
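
The closed forms in items (2), (3) and (4) can be cross-checked against the definition (2.12) by maximizing $\zeta \mapsto \zeta u - \log E e^{\zeta Y}$ numerically in dimension one. The sketch below uses a ternary search, which is valid because this map is concave. The function names, the search intervals and the test points $u$ are assumptions chosen for the illustration.

```python
import math

def cramer_transform(log_laplace, u, lo, hi, iterations=200):
    """Numerical sup over zeta in [lo, hi] of zeta * u - log_laplace(zeta) (d = 1).

    The objective is concave in zeta, so a ternary search finds its maximum,
    provided the maximizer lies in [lo, hi].
    """
    def objective(zeta):
        return zeta * u - log_laplace(zeta)
    for _ in range(iterations):
        m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
        if objective(m1) < objective(m2):
            lo = m1
        else:
            hi = m2
    return objective((lo + hi) / 2)

# Log-Laplace transforms zeta -> log E exp(zeta * Y) of three laws.
def rademacher(zeta):       # P(Y = +1) = P(Y = -1) = 1/2
    return math.log(math.cosh(zeta))

def exponential(zeta):      # exponential law with expectation 1, zeta < 1
    return -math.log(1.0 - zeta)

def poisson(zeta):          # Poisson law with expectation 1
    return math.exp(zeta) - 1.0

numeric = {
    "rademacher": cramer_transform(rademacher, 0.5, -20.0, 20.0),
    "exponential": cramer_transform(exponential, 2.0, -50.0, 1.0 - 1e-9),
    "poisson": cramer_transform(poisson, 3.0, -20.0, 20.0),
}
closed_form = {
    "rademacher": (1.5 * math.log(1.5) + 0.5 * math.log(0.5)) / 2,
    "exponential": 2.0 - 1.0 - math.log(2.0),
    "poisson": 3.0 * math.log(3.0) - 3.0 + 1.0,
}
for name in numeric:
    print(name, numeric[name], closed_form[name])
```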
Examples 2.14. If $EY = 0$, $c^Y$ is quadratic at the origin since $c^Y(u) = \langle u, \Gamma_Y^{-1} u \rangle/2 + o(|u|^2)$
where $\Gamma_Y$ is the covariance of $Y$. This rules out the usual costs $c(u) = |u|^p$ with $p \ne 2$.
Nevertheless, taking $Y$ a real valued variable with density $C \exp(-|z|^p/p)$ with $p \ge 1$ leads
to $c^Y(u) = |u|^p/p\, (1 + o_{|u| \to \infty}(1))$. The case $p = 1$ follows from Example 2.13-(3) above.
To see that the result still holds with $p > 1$, compute by means of the Laplace method
the principal part as $\zeta$ tends to infinity of $\int_{\mathbb{R}} e^{-|z|^p/p} e^{\zeta z}\, dz = \sqrt{2\pi(q - 1)}\, \zeta^{q/2 - 1} e^{\zeta^q/q} (1 + o_{\zeta \to +\infty}(1))$
where $1/p + 1/q = 1$.

Of course, we deduce a related $d$-dimensional result considering $Y$ with the density
$C \exp(-|z|_p^p/p)$ where $|z|_p^p = \sum_{i \le d} |z_i|^p$. This gives $c^Y(u) = |u|_p^p/p\, (1 + o_{|u| \to \infty}(1))$.

The drawback of the specific shape of any Cramér transform (see Examples 2.14)
is overcome by means of a continuous transformation as in Examples 2.11-(3 & 4).



2.2. Convergence results. The structure of (2.6) suggests that a $(k,n)$-LDP may be
seen as the limit as $k$ tends to infinity of $n$-LDPs indexed by $k$. This is true and made
precise at Proposition 2.19 and Theorem 2.20 below.

Let us have a look at the $n$-LDP satisfied by $\{N^k_n\}_{n \ge 1}$ with $k$ fixed. It is very similar to
the $n$-LDP of Sanov's theorem, see Proposition 2.19 below. The only difference comes
from the contribution of the initial positions $z_{n,i}$ which make $(X^k_{n,i}(1))$ a triangular
array of independent, non-identically distributed variables. Recall that Sanov's theorem ([8], Theorem
6.2.10) states that the empirical measures $\{\frac{1}{n} \sum_{i=1}^n \delta_{X_i}\}_{n \ge 1}$ of a sequence of independent
$P$-distributed random variables taking their values in a Polish space $\mathcal{X}$ obey the $n$-LDP
in $\mathcal{P}_{\mathcal{X}}$ with the rate function

(2.15)    $Q \in \mathcal{P}_{\mathcal{X}} \mapsto H(Q|P) = \int_{\mathcal{X}} \log\big(\frac{dQ}{dP}\big)\, dQ$ if $Q \ll P$, and $+\infty$ otherwise.

$H(Q|P)$ is called the relative entropy of $Q$ with respect to $P$.

Consider now the random empirical measures on $\mathbb{R}^{2d} = \mathbb{R}^d \times \mathbb{R}^d$ which are defined by

(2.16)    $M^k_n = \frac{1}{n} \sum_{i=1}^n \delta_{(z_{n,i}, X^k_{n,i}(1))}$

for all $k, n \ge 1$. Clearly, $N^k_n$ is the second marginal of $M^k_n$. Denote for each $k \ge 1$

(2.17)    $\pi^k(dx_0 dx_1) = \mu(dx_0)\, \mathrm{Law}(U^k_{x_0})(dx_1) \in \mathcal{P}_{\mathbb{R}^{2d}}.$

This means that $\pi^k = \mathrm{Law}(X(0), X^k(1))$ where $X(0)$ is an $\mathbb{R}^d$-valued random variable
which is $\mu$-distributed and $P(X^k(1) \in dx_1 \mid X(0) = x_0) = \mathrm{Law}(U^k_{x_0})(dx_1)$. Define

    $S_k(\rho) = \frac{1}{k} H(\rho|\pi^k)$ if $\rho_0 = \mu$, and $+\infty$ otherwise, $\quad \rho \in \mathcal{P}_{\mathbb{R}^{2d}}$

and

(2.18)    $T_k(\nu) = \inf_{\rho \in \Pi(\mu, \nu)} \frac{1}{k} H(\rho|\pi^k), \quad \nu \in \mathcal{P}_{\mathbb{R}^d}.$

Proposition 2.19. For each fixed $k \ge 1$,

(a) $\{M^k_n\}_{n \ge 1}$ obeys the $n$-LDP in $\mathcal{P}_{\mathbb{R}^{2d}}$ with the good rate function $k S_k$ and

(b) $\{N^k_n\}_{n \ge 1}$ obeys the $n$-LDP in $\mathcal{P}_{\mathbb{R}^d}$ with the good rate function $k T_k$.

The order of magnitude of $H(\rho|\pi^k)$ is $k$, since $\{U^k\}$ obeys a $k$-LDP. The rescaled entropy
$S_k$ is of order 1. If it exists, $\lim_{k \to \infty} S_k$ may be interpreted as a specific entropy (see [23]).
It happens that $S_k$ and $T_k$ Γ-converge. The limit of $S_k$ is

    $S(\rho) = \int_{\mathbb{R}^{2d}} c\, d\rho$ if $\rho_0 = \mu$, and $+\infty$ otherwise, $\quad \rho \in \mathcal{P}_{\mathbb{R}^{2d}}$

where $\int_{\mathbb{R}^{2d}} c\, d\rho = \int_{\mathbb{R}^{2d}} c(x_0, x_1)\, \rho(dx_0 dx_1)$ and $c$ is given at (2.10).

Theorem 2.20. We have

(a) $\Gamma\text{-}\lim_{k \to \infty} S_k = S$ in $\mathcal{P}_{\mathbb{R}^{2d}}$ and

(b) $\Gamma\text{-}\lim_{k \to \infty} T_k = T$ in $\mathcal{P}_{\mathbb{R}^d}$.

These limits will allow us to deduce the following approximation results. Recall that
the minimization problems (MK^α_k), (MK^α) and (MK) are defined at Section 1.3.

Theorem 2.21. Assume that $T_c(\mu, \nu) < \infty$.

(a) We have: $\lim_{\alpha \to \infty} \lim_{k \to \infty} \inf_{\rho:\ \rho_0 = \mu} \{\frac{1}{k} H(\rho|\pi^k) + \alpha\, d(\rho_1, \nu)\} = T_c(\mu, \nu)$.

(b) For each $k$ and $\alpha$, (MK^α_k) admits a unique solution $\rho^\alpha_k$ in $\mathcal{P}_{\mathbb{R}^{2d}}$. For each $\alpha$, $(\rho^\alpha_k)_{k \ge 1}$
is a relatively compact sequence in $\mathcal{P}_{\mathbb{R}^{2d}}$ and any limit point of $(\rho^\alpha_k)_{k \ge 1}$ is a solution
of (MK^α).

(c) For each $\alpha$, (MK^α) admits at least one (possibly not unique) solution $\rho^\alpha$. The sequence
$(\rho^\alpha)_{\alpha \ge 1}$ is relatively compact in $\mathcal{P}_{\mathbb{R}^{2d}}$ and any limit point of $(\rho^\alpha)_{\alpha \ge 1}$ is a solution
of (MK).

2.3. The proofs. The proofs of these announced results are done at Section 6. Theorem
2.9 is part of Theorem 6.2. Proposition 2.19 is Lemma 6.1. Theorem 2.20 is Theorem 6.5
and Theorem 2.21 is Theorem 6.6.

3. Large deviations of a simply indexed sequence of random measures

As a warm-up exercise, let us first consider a usual sequence of random measures.

We present an abstract setting instead of the situation described at Section 2. Let $\mathcal{X}$
and $\mathcal{Z}$ be two Polish spaces which play respectively the part of the space of "paths" $\mathbb{R}^{2d}$
and the space of initial conditions $\mathbb{R}^d$. The cost of this extension is quite low: the main
property of Polish spaces to be used later is that any Borel probability measure is tight.

Take a triangular array $(z_{n,i} \in \mathcal{Z};\ 1 \le i \le n,\ n \ge 1)$ on $\mathcal{Z}$ such that the sequence of
empirical measures $\mu_n = \frac{1}{n} \sum_{i=1}^n \delta_{z_{n,i}} \in \mathcal{P}_{\mathcal{Z}}$ satisfies

(3.1)    $\lim_{n \to \infty} \mu_n = \mu$

for some probability measure $\mu \in \mathcal{P}_{\mathcal{Z}}$. Let $(P_z \in \mathcal{P}_{\mathcal{X}};\ z \in \mathcal{Z})$ be a collection of probability
laws on $\mathcal{X}$ which is assumed to be a Feller system in the sense of Definition 2.8.
We work with a triangular array of independent $\mathcal{X}$-valued random variables $(X_{n,i};\ 1 \le i \le n,\ n \ge 1)$
where for each index $(n, i)$ the law of $X_{n,i}$ is $P_{z_{n,i}}$. This means that for all $n \ge 1$,

    $\mathrm{Law}(X_{n,i};\ 1 \le i \le n) = \otimes_{1 \le i \le n} P_{z_{n,i}}.$

Proposition 3.14 below states an LDP in $\mathcal{P}_{\mathcal{X}}$ for the empirical measures

    $L_n = \frac{1}{n} \sum_{i=1}^n \delta_{X_{n,i}}$

as $n$ tends to infinity. It is a variant of Sanov's theorem which has already been studied
by Dawson and Gärtner in [7] and revisited by Cattiaux and Léonard in [6]. Nevertheless,
the expression (3.15) of the rate function doesn't appear in these cited papers. The proof
of Proposition 3.14 will be done as a first step for the proof of the LDP of a doubly indexed
sequence: most of its ingredients will be recycled at Section 4.

Notations. We write shortly $\mathcal{P}_{ZX}$ and $C_{ZX}$ for $\mathcal{P}_{\mathcal{Z} \times \mathcal{X}}$ and $C_{\mathcal{Z} \times \mathcal{X}}$. The dual space $C'_{ZX}$ of
$(C_{ZX}, \|\cdot\|)$ is equipped with the *-weak topology $\sigma(C'_{ZX}, C_{ZX})$, see Section 1.7.
Let $(Z, X)$ be the canonical projections: $Z(z, x) = z$, $X(z, x) = x$, $(z, x) \in \mathcal{Z} \times \mathcal{X}$. For any
$q \in \mathcal{P}_{ZX}$, we write the disintegration

    $q(dz\,dx) = q_Z(dz)\, q^z(dx)$

where $q_Z(dz) = q(Z \in dz)$ is the (marginal) law of $Z$ under $q$ and $q^z(dx) = q(X \in dx \mid Z = z)$,
$z \in \mathcal{Z}$, is a regular conditional version of the law of $X$ knowing that $Z = z$. We
also define $p \in \mathcal{P}_{ZX}$ by

    $p(dz\,dx) = \mu(dz)\, P_z(dx).$

The LDP for $\{L_n\}_{n \ge 1}$ will be obtained as a direct consequence of the contraction
principle applied to some LDP for the sequence of $\mathcal{P}_{ZX}$-valued random variables

    $K_n = \frac{1}{n} \sum_{i=1}^n \delta_{(z_{n,i}, X_{n,i})}.$

Proposition 3.2. Suppose that (3.1) holds for some $\mu$ in $\mathcal{P}_{\mathcal{Z}}$ and that $(P_z;\ z \in \mathcal{Z})$ is a
Feller system. Then $\{K_n\}_{n \ge 1}$ obeys the LDP in $\mathcal{P}_{ZX}$ with the good rate function

(3.3)    $h(q) := H(q|p) = \int_{\mathcal{Z}} H(q^z|P_z)\, \mu(dz)$ if $q_Z = \mu$, and $+\infty$ otherwise, $\quad q \in \mathcal{P}_{ZX}.$

Proof. For all $n$ and all $F \in C_{ZX}$, writing $F_z = F(z, \cdot)$, the normalized log-Laplace transform of $K_n$ is

    $\psi_n(F) := \frac{1}{n} \log E \exp(n \langle F, K_n \rangle)$
              $= \frac{1}{n} \sum_{i=1}^n \log E e^{F(z_{n,i}, X_{n,i})}$
              $= \int_{\mathcal{Z}} \log \langle e^{F_z}, P_z \rangle\, \mu_n(dz).$

As $(\mu_n)_{n \ge 1}$ converges to $\mu$ and $(P_z;\ z \in \mathcal{Z})$ is a Feller system, for all $F \in C_{ZX}$ we have
the limit:

    $\psi(F) := \lim_{n \to \infty} \psi_n(F)$
(3.4)          $= \int_{\mathcal{Z}} \log \langle e^{F_z}, P_z \rangle\, \mu(dz).$

Following the proof of Sanov's theorem (see [8], Section 6.4) based on Dawson-Gärtner's
theorem on the projective limit of LD systems (see [7], Section 3), one obtains that $\{K_n\}$
obeys the LDP in $C'_{ZX}$ with the rate function

(3.5)    $\psi^*(q) = \sup_{F \in C_{ZX}} \big\{ \langle F, q \rangle - \int_{\mathcal{Z}} \log \langle e^{F_z}, P_z \rangle\, \mu(dz) \big\}, \quad q \in C'_{ZX}.$

It is proved at Lemma 3.7 below that for all $q \in C'_{ZX}$,

    $\psi^*(q) = H(q|p)$ if $q \in \mathcal{P}_{ZX}$ and $q_Z = \mu$, and $+\infty$ otherwise.

It follows that $\{K_n\}_{n \ge 1}$ obeys the LDP in $\mathcal{P}_{ZX}$ with the rate function $h$.
It remains to note that as the relative entropy is inf-compact and $\{q \in \mathcal{P}_{ZX};\ q_Z = \mu\}$
is closed, $h$ is also inf-compact: it is a good rate function. □

As a by-product of this proof, we have the following corollary which is mentioned for
future use.

Corollary 3.6. [Hypotheses of Proposition 3.2]. The random system $\{K_n\}_{n \ge 1}$ obeys the
LDP in $C'_{ZX}$ with the rate function $\psi^*$ given at (3.5).

During the proof of Proposition 3.2 we have used the following lemma.

Lemma 3.7. With $\psi^*(q)$ defined by formula (3.5) we have

    $\psi^*(q) = H(q|p)$ if $q \in \mathcal{P}_{ZX}$ and $q_Z = \mu$, and $+\infty$ otherwise.

Proof. The proof is twofold. We show that

(i) for all $q \in C'_{ZX}$, $\psi^*(q) < +\infty$ implies that $q$ belongs to $\mathcal{P}_{ZX}$ and its $Z$-marginal is
$q_Z = \mu$.

(ii) Then, we show that for all $q \in \mathcal{P}_{ZX}$ such that $q_Z = \mu$, we have $\psi^*(q) = H(q|p)$.

Let $q \in C'_{ZX}$ be such that

    $\sup_{F \in C_{ZX}} \{\langle F, q \rangle - \psi(F)\} = \psi^*(q) < \infty.$

Let us begin with the proof of (i).

• Let us show that $q \ge 0$. Let $F_0 \in C_{ZX}$ be such that $F_0 \ge 0$. As $\psi(a F_0) \le 0$ for all $a \le 0$,

    $\psi^*(q) \ge \sup_{a \le 0} \{a \langle F_0, q \rangle - \psi(a F_0)\}$
              $\ge \sup_{a \le 0} \{a \langle F_0, q \rangle\}$
              $= 0$ if $\langle F_0, q \rangle \ge 0$, and $+\infty$ otherwise.

Therefore, as $\psi^*(q) < \infty$, $\langle F_0, q \rangle \ge 0$ for all $F_0 \ge 0$, which is the desired result.

• Let us show that $\langle 1, q \rangle = 1$. For any constant function $F \equiv c \in \mathbb{R}$, we have $\psi(c 1) = c$.
It follows that

    $\psi^*(q) \ge \sup_{c \in \mathbb{R}} \{c \langle 1, q \rangle - \psi(c 1)\}$
              $= \sup_{c \in \mathbb{R}} \{c (\langle 1, q \rangle - 1)\}$
              $= 0$ if $\langle 1, q \rangle = 1$, and $+\infty$ otherwise

from which the result follows.

• In order to prove that $q$ is σ-additive, we have to prove that for any sequence $(F_n)_{n \ge 1}$ in
$C_{ZX}$ such that $F_n \ge 0$ for all $n$ and $(F_n(z, x))_{n \ge 1}$ decreases to zero for each $(z, x) \in \mathcal{Z} \times \mathcal{X}$,
we have

(3.8)    $\lim_{n \to \infty} \langle F_n, q \rangle = 0.$

For such a sequence, one can apply the dominated convergence theorem to obtain that

    $\lim_{n \to \infty} \psi(a F_n) = 0,$

for all $a \ge 0$. It follows that for all $q \in C'_{ZX}$,

    $\psi^*(q) \ge \sup_{a \ge 0} \limsup_{n \to \infty} \{a \langle F_n, q \rangle - \psi(a F_n)\}$
              $\ge \sup_{a \ge 0} \big( \limsup_{n \to \infty} a \langle F_n, q \rangle - \lim_{n \to \infty} \psi(a F_n) \big)$
              $= \sup_{a \ge 0} a \limsup_{n \to \infty} \langle F_n, q \rangle$
              $= 0$ if $\limsup_{n \to \infty} \langle F_n, q \rangle \le 0$, and $+\infty$ otherwise.

Therefore, as $\psi^*(q) < \infty$, we have $\limsup_{n \to \infty} \langle F_n, q \rangle \le 0$. Since we have just seen that
$q \ge 0$, we have obtained (3.8).

This completes the proof of $q \in \mathcal{P}_{ZX}$ since we have proved that any $q \in C'_{ZX}$ such that
$\psi^*(q) < \infty$ is nonnegative, has a unit mass and satisfies (3.8). Therefore, $q$ is uniquely
identified with a probability measure on the Polish space $\mathcal{Z} \times \mathcal{X}$ (see [18], Proposition
II-7-2).

• To complete the proof of (i), it remains to show that for any $q \in \mathcal{P}_{ZX}$, $\psi^*(q) < \infty$
implies that $q_Z = \mu$. Indeed, choosing $F(z, x) = g(z)$ not depending on $x$ with $g \in C_{\mathcal{Z}}$,
one sees that

    $\psi^*(q) \ge \sup_{g \in C_{\mathcal{Z}}} \{\langle g, q_Z \rangle - \langle g, \mu \rangle\}$
              $= 0$ if $q_Z = \mu$, and $+\infty$ otherwise

which gives the announced result.

Now, let us show (ii). For all $q \in \mathcal{P}_{ZX}$ such that $q_Z = \mu$, or equivalently such that
$q(dz\,dx) = \mu(dz)\, q^z(dx)$, we have

    $\psi^*(q) = \sup_{F \in C_{ZX}} \int_{\mathcal{Z}} \big( \langle F_z, q^z \rangle - \log \langle e^{F_z}, P_z \rangle \big)\, \mu(dz)$
              $\le \int_{\mathcal{Z}} \sup_{f \in C_{\mathcal{X}}} \{\langle f, q^z \rangle - \log \langle e^f, P_z \rangle\}\, \mu(dz)$
              $\overset{(a)}{=} \int_{\mathcal{Z}} H(q^z|P_z)\, \mu(dz)$
(3.9)          $\overset{(b)}{=} H(q|p)$

where equality (a) follows from the well-known variational representation of the relative
entropy in a Polish space $\mathcal{X}$

(3.10)    $H(Q|P) = \sup_{f \in C_{\mathcal{X}}} \big( \int_{\mathcal{X}} f\, dQ - \log \int_{\mathcal{X}} e^f\, dP \big), \quad P, Q \in \mathcal{P}_{\mathcal{X}}$

and equality (b) follows from the tensorization property

(3.11)    $H(q|p) = H(q_Z|p_Z) + \int_{\mathcal{Z}} H(q^z|p^z)\, q_Z(dz)$

since $p_Z = q_Z = \mu$ and $p^z = p(\cdot \mid Z = z) = P_z$. Note that $z \mapsto H(q^z|P_z)$ is measurable.
Indeed, $(Q, P) \mapsto H(Q|P)$ is measurable as a lower semicontinuous function and $z \mapsto (q^z, P_z)$
is measurable since its coordinates are measurable: $z \mapsto q^z$ is measurable as a
regular conditional version in a Polish space and $z \mapsto P_z$ is assumed to be continuous.

We have just proved that $\psi^* \le h$.

The converse inequality follows from Jensen's inequality: $\psi(F) \le \log \int_{\mathcal{Z}} \langle e^{F_z}, P_z \rangle\, \mu(dz) = \log \langle e^F, p \rangle$
for all $F \in C_{ZX}$. Indeed, taking the convex conjugates leads us for all $q \in \mathcal{P}_{ZX}$
to

    $\psi^*(q) \ge \sup_{F \in C_{ZX}} \{\langle F, q \rangle - \log \langle e^F, p \rangle\}$
(3.12)          $= H(q|p).$

This equality is (3.10). This completes the proof of the lemma. □
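
The variational representation (3.10) is easy to test on a finite set, where the supremum is attained at $f^* = \log(dQ/dP)$. The sketch below compares $H(Q|P)$ with the dual functional at $f^*$ and at randomly drawn test functions. The two distributions, the seed and the function names are hypothetical choices made for the illustration.

```python
import math
import random

def relative_entropy(q, p):
    """H(q | p) = sum_x q(x) log(q(x) / p(x)) on a finite set (p > 0 everywhere)."""
    return sum(qx * math.log(qx / px) for qx, px in zip(q, p) if qx > 0.0)

def dual_value(f, q, p):
    """The functional f -> <f, q> - log <exp(f), p> appearing in (3.10)."""
    return (sum(fx * qx for fx, qx in zip(f, q))
            - math.log(sum(math.exp(fx) * px for fx, px in zip(f, p))))

p = [0.2, 0.5, 0.3]
q = [0.6, 0.1, 0.3]
H = relative_entropy(q, p)

# The supremum is attained at f* = log(dq/dp) (any additive constant can be added).
f_star = [math.log(qx / px) for qx, px in zip(q, p)]
attained = dual_value(f_star, q, p)

# Any other test function gives a smaller or equal value.
rng = random.Random(1)
random_values = [dual_value([rng.uniform(-3.0, 3.0) for _ in p], q, p)
                 for _ in range(1000)]
print(H, attained, max(random_values))
```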

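Remark 3.13. The $\|\cdot\|$-continuity of $q$ in $C'_{ZX}$ didn't play any role in the proof. Only its
linearity has been used.

Now, we investigate the large deviations of

    $L_n = \frac{1}{n} \sum_{i=1}^n \delta_{X_{n,i}}.$

Let us denote the $X$-marginal of $p$ by

    $P(dx) = \int_{\mathcal{Z}} P_z(dx)\, \mu(dz) \in \mathcal{P}_{\mathcal{X}}.$

Proposition 3.14. Suppose that (3.1) holds for some $\mu$ in $\mathcal{P}_{\mathcal{Z}}$ and that $(P_z;\ z \in \mathcal{Z})$ is a
Feller system.

(a) Then, $\{L_n\}_{n \ge 1}$ obeys the LDP in $\mathcal{P}_{\mathcal{X}}$ with the good rate function $H$ which is defined
for all $Q \in \mathcal{P}_{\mathcal{X}}$ by

(3.15)    $H(Q) = \inf \big\{ \int_{\mathcal{Z}} H(\Pi_z|P_z)\, \mu(dz);\ (\Pi_z)_{z \in \mathcal{Z}} : \int_{\mathcal{Z}} \Pi_z\, \mu(dz) = Q \big\}$

where the transition kernels $z \in \mathcal{Z} \mapsto \Pi_z \in \mathcal{P}_{\mathcal{X}}$ are measurable.

(b) If $H(Q) < +\infty$, there exists a unique (up to $\mu$-a.e. equality) kernel $(\Pi^*_z)_{z \in \mathcal{Z}}$ which
realizes the infimum in (3.15): $H(Q) = \int_{\mathcal{Z}} H(\Pi^*_z|P_z)\, \mu(dz)$.

(c) If in addition the Feller system $(P_z)_{z \in \mathcal{Z}}$ satisfies

    $P_z = P(\cdot \mid \beta(X) = z)$

for $\mu$-almost every $z \in \mathcal{Z}$ and some continuous function $\beta : \mathcal{X} \to \mathcal{Z}$, we have for
all $Q \in \mathcal{P}_{\mathcal{X}}$

    $H(Q) = H(Q|P)$ if $Q$ satisfies $\beta \circ Q = \mu$, and $+\infty$ otherwise,

and the minimizing kernel $(\Pi^*_z)$ of (3.15) is $\Pi^*_z = Q(\cdot \mid \beta(X) = z)$, for $\mu$-almost
every $z$.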

Proof. Let us prove (a). As L„ is the Af-marginal of Kn and {Kn} obeys the LDP 
with a good rate function, the statement (a) follows from the contraction principle (see 
[H|, Theorem 4.2.1): {Ln} obeys the LDP in Vx with the good rate function H{Q) = 
inf {h{q); q G Vzx ■ qx = Q} which is (13.151) . 

The statement (b) immediately follows from the strict convexity and the inf-compactness 
of g H- > H{q\p) which is restricted to the closed convex set {q G Vzx '■ Qz = f^,<lx = Q}- 
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Let US prove (c). To do this, we rewrite the proof of Proposition 13.21 with instead of 
Kn- We obtain that obeys the LDP in Vx with the rate function 



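$$\Psi^*(Q) = \sup_{f\in C_{\mathcal{X}}}\Big\{\langle f,Q\rangle - \int_{\mathcal{Z}}\log\langle e^{f},P_z\rangle\,\mu(dz)\Big\}. \tag{3.16}$$

This equality is (3.5) where we replace $q$ by $Q$ and $F(z,x)$ by $f(x)$. Choosing $f\in C_{\mathcal{X}}$ of the form $f = g\circ\beta$ with $g$ in $C_{\mathcal{Z}}$ (so that $\log\langle e^{g\circ\beta},P_z\rangle = g(z)$ for $\mu$-almost every $z$) gives us for all $Q\in\mathcal{P}_{\mathcal{X}}$

$$\Psi^*(Q) \ge \sup_{g\in C_{\mathcal{Z}}}\{\langle g,\beta\circ Q\rangle - \langle g,\mu\rangle\} = \begin{cases} 0 & \text{if } \beta\circ Q=\mu\\ +\infty & \text{otherwise.}\end{cases}$$

It follows that $\Psi^*(Q)<\infty$ implies that $\beta\circ Q=\mu$. For such a $Q$, as in the proof of Proposition 3.2, we obtain the inequality $\Psi^*(Q)\le\int_{\mathcal{Z}} H(Q(\,\cdot\mid Z=z)\mid P_z)\,\mu(dz) = H(Q\mid P)$. This last equality follows from the tensorization property of the relative entropy, see (3.11). This proves that $\Psi^*\le H$. The converse inequality follows from Jensen's inequality exactly as in the proof of inequality (3.12). We have shown that

$$\Psi^* = H. \tag{3.17}$$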

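The last statement about the minimizing kernel is a direct consequence of the tensorization formula (3.11):

$$\begin{aligned}
\inf\Big\{\int_{\mathcal{Z}} H(\Pi_z\mid P_z)\,\mu(dz);\ (\Pi_z)_{z\in\mathcal{Z}} : \int_{\mathcal{Z}}\Pi_z\,\mu(dz) = Q\Big\}
&= H(Q\mid P)\\
&= H(Q_{\mathcal{Z}}\mid P_{\mathcal{Z}}) + \int_{\mathcal{Z}} H\big(Q(\,\cdot\mid Z=z)\mid P(\,\cdot\mid Z=z)\big)\,Q_{\mathcal{Z}}(dz)\\
&= \int_{\mathcal{Z}} H\big(Q(\,\cdot\mid Z=z)\mid P_z\big)\,\mu(dz)
\end{aligned}$$

where the first equality follows from (a) and the first part of this statement, and the last equality follows from $H(Q_{\mathcal{Z}}\mid P_{\mathcal{Z}}) = H(\mu\mid\mu) = 0$. □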
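
The tensorization identity used in this last step can be checked numerically on finite spaces. The following is a minimal sketch, assuming a two-point space for $z$ and a two-point space for $x$; the weights `p` and `q` and the helper names are hypothetical and are not taken from the paper. Both joint laws are chosen with the same first marginal, so that the term $H(q_{\mathcal{Z}}\mid p_{\mathcal{Z}})$ vanishes as in the proof.

```python
import math

def rel_entropy(q, p):
    """Relative entropy H(q|p) = sum q log(q/p) for finite laws given as dicts."""
    total = 0.0
    for key, qv in q.items():
        if qv > 0.0:
            if p.get(key, 0.0) == 0.0:
                return math.inf          # q is not absolutely continuous w.r.t. p
            total += qv * math.log(qv / p[key])
    return total

def marginal_z(joint):
    """First marginal of a joint law indexed by pairs (z, x)."""
    out = {}
    for (z, _x), w in joint.items():
        out[z] = out.get(z, 0.0) + w
    return out

def conditional(joint, z):
    """Conditional law of x given the first coordinate equals z."""
    mz = marginal_z(joint)[z]
    return {x: w / mz for (zz, x), w in joint.items() if zz == z}

# Hypothetical toy laws on {0, 1} x {"a", "b"} sharing the first marginal (0.4, 0.6).
p = {(0, "a"): 0.1, (0, "b"): 0.3, (1, "a"): 0.4, (1, "b"): 0.2}
q = {(0, "a"): 0.25, (0, "b"): 0.15, (1, "a"): 0.2, (1, "b"): 0.4}

lhs = rel_entropy(q, p)                                  # H(q|p)
qz, pz = marginal_z(q), marginal_z(p)
marginal_term = rel_entropy(qz, pz)                      # H(q_Z|p_Z), zero here
conditional_term = sum(
    qz[z] * rel_entropy(conditional(q, z), conditional(p, z)) for z in qz
)
rhs = marginal_term + conditional_term
```

In this toy case `lhs` and `rhs` agree up to rounding, and `marginal_term` is numerically zero because the two first marginals coincide.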

Remark 3.18. The identity (3.15) is a formal inf-convolution formula and (3.16) is its dual formulation: "the convex conjugate of an inf-convolution is the sum of the convex conjugates".

Remark 3.19. Statement (c) holds true also when $\beta$ is only assumed to be measurable. Indeed, (3.16) can be strengthened by

$$\Psi^*(Q) = \sup_{f\in C_{\mathcal{X}}}\Big\{\langle f,Q\rangle - \int_{\mathcal{Z}}\log\langle e^{f},P_z\rangle\,\mu(dz)\Big\} = \sup_{f\in B(\mathcal{X})}\Big\{\langle f,Q\rangle - \int_{\mathcal{Z}}\log\langle e^{f},P_z\rangle\,\mu(dz)\Big\},$$

for all $Q\in\mathcal{P}_{\mathcal{X}}$, where $B(\mathcal{X})$ is the space of all measurable bounded functions on $\mathcal{X}$. For the second equality, note that in the proof of Proposition 3.2, taking the test functions $F(z,x)$ bounded, $z$-continuous and $x$-measurable (instead of $x$-continuous), does not change anything except that in the expression of the rate function $\sup_F\{\langle F,q\rangle - \psi(F)\}$, the sup is taken over this larger space instead of $C_{\mathcal{ZX}}$. As the rate function is unique, the sup over these two spaces is the same. A similar argument in the present situation leads to $\sup_{f\in C_{\mathcal{X}}} = \sup_{f\in B(\mathcal{X})}$. Finally, choosing $f\in B(\mathcal{X})$ of the form $f = g\circ\beta$ with $g$ in $C_{\mathcal{Z}}$ gives us $\Psi^*(Q)\ge\sup_{g\in C_{\mathcal{Z}}}\{\langle g,\beta\circ Q\rangle - \langle g,\mu\rangle\}$ and one concludes as in the previous proof.

4. Large deviations of a doubly indexed sequence of random measures. Preliminary results

We keep the abstract Polish spaces $\mathcal{Z}$ and $\mathcal{X}$ of Section 2 as well as the triangular array $(z_{n,i}\in\mathcal{Z};\ 1\le i\le n,\ n\ge1)$ which satisfies (3.1). For each $k\ge1$, we consider a Feller system $(P_z^k\in\mathcal{P}_{\mathcal{X}};\ z\in\mathcal{Z})$ of probability laws on $\mathcal{X}$ and a triangular array of independent $\mathcal{X}$-valued random variables $(X_{n,i}^k;\ 1\le i\le n,\ n\ge1)$ where for each index $(n,i)$ the law of $X_{n,i}^k$ is $P_{z_{n,i}}^k$. This means that for all $k,n\ge1$,

$$\mathrm{Law}(X_{n,i}^k;\ 1\le i\le n) = \bigotimes_{1\le i\le n} P_{z_{n,i}}^k.$$

The main result of the next Section 5 states the $(k,n)$-LDP in $\mathcal{P}_{\mathcal{X}}$ for

$$L_n^k = \frac1n\sum_{i=1}^n\delta_{X_{n,i}^k}.$$

As in Section 3, this LDP will be obtained by means of the contraction principle applied to some LDP for the $\mathcal{P}_{\mathcal{ZX}}$-valued random variables

$$K_n^k = \frac1n\sum_{i=1}^n\delta_{(z_{n,i},X_{n,i}^k)}.$$

The main result of the present section is Theorem 4.9. It states the $(k,n)$-LDP for $\{K_n^k\}$.

We also assume that for each $z\in\mathcal{Z}$, $(P_z^k)_{k\ge1}$ obeys some $k$-LDP in $\mathcal{X}$ with rate function $J_z$. This means that for all measurable subsets $A$ of $\mathcal{X}$

$$-\inf_{x\in\mathrm{int}\,A} J_z(x) \le \liminf_{k\to\infty}\frac1k\log P_z^k(A) \le \limsup_{k\to\infty}\frac1k\log P_z^k(A) \le -\inf_{x\in\mathrm{cl}\,A} J_z(x)$$

where $\mathrm{int}\,A$ and $\mathrm{cl}\,A$ are the interior and the closure of $A$ in $\mathcal{X}$. This plays the part of
Cramer's theorem and its transformations at Section [21 see Examples 12.111 with Jz{x) = 
c^(xi — Xq), X = {xo,Xi) G M^'^, z & M.'^ if Xq = z and +oo otherwise. 

4.1. Preliminary results. Before proving the $(k,n)$-LDP for $\{K_n^k\}$ at Theorem 4.9, we need some preliminary results. The following lemma is Corollary 7.4; its detailed proof is given at Section 7.

Lemma 4.1. Let $(\mathcal{F},\|\cdot\|)$ be a normed vector space and $\mathcal{Q}$ be its dual space. Let $\Lambda$, $\Lambda_k$, $k\ge1$, be real-valued convex functions on $\mathcal{F}$ such that

(a) $\lim_{k\to\infty}\Lambda_k(F) = \Lambda(F)$ for all $F\in\mathcal{F}$ and

(b) there exists $c>0$ such that $\sup_{k\ge1}|\Lambda_k(F)|\le c(1+\|F\|)$ for all $F\in\mathcal{F}$.

Then, the convex conjugates $\Lambda_k^*$ of $\Lambda_k$ $\Gamma$-converge to the convex conjugate $\Lambda^*$ of $\Lambda$:

$$\Gamma\text{-}\lim_{k\to\infty}\Lambda_k^*(q) = \Lambda^*(q)$$

for all $q\in\mathcal{Q}$, with respect to the $*$-weak topology $\sigma(\mathcal{Q},\mathcal{F})$.

The following lemma is proved in [14].

Lemma 4.2. Suppose that for all $k\ge1$, $\{\mu_n^k\}_{n\ge1}$ obeys a weak $n$-LDP with rate function $kI_k$ and also suppose that the sequence $(I_k)_{k\ge1}$ $\Gamma$-converges to some function $I$. Then, $\{\mu_n^k\}_{k,n\ge1}$ obeys a weak $(k,n)$-LDP with rate function $I$.

Proof. See [14]. □

We define for each $k$ and all $F\in C_{\mathcal{ZX}}$,

$$\Lambda_k(F) = \frac1k\int_{\mathcal{Z}}\log\langle e^{kF_z},P_z^k\rangle\,\mu(dz),$$

$$\Lambda(F) = \int_{\mathcal{Z}}\sup_{x\in\mathcal{X}}\{F(z,x) - J_z(x)\}\,\mu(dz). \tag{4.3}$$

Note that $z\mapsto\sup_{x\in\mathcal{X}}\{F(z,x) - J_z(x)\}$ is measurable since it is the pointwise limit of continuous functions: see (4.5) below, so that $\Lambda(F)$ is well-defined.
Observe that $\Lambda_k$ is a normalized version of the function $\psi$ defined at (3.4).

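Lemma 4.4. We assume that for each $z\in\mathcal{Z}$, $(P_z^k)_{k\ge1}$ obeys the $k$-LDP in $\mathcal{X}$ with the good rate function $J_z$. Then, for all $F\in C_{\mathcal{ZX}}$ we have

$$\lim_{k\to\infty}\frac1k\log\langle e^{kF_z},P_z^k\rangle = \sup_{x\in\mathcal{X}}\{F(z,x) - J_z(x)\}, \tag{4.5}$$

$$\lim_{k\to\infty}\Lambda_k(F) = \Lambda(F) \quad\text{and} \tag{4.6}$$

$$\sup_k|\Lambda_k(F)|\le\|F\|,\quad |\Lambda(F)|\le\|F\|,\quad\forall F\in C_{\mathcal{ZX}}. \tag{4.7}$$

The functions $\Lambda_k$ and $\Lambda$ are convex and $\sigma(C_{\mathcal{ZX}},C'_{\mathcal{ZX}})$-lower semicontinuous.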

Proof. Thanks to the assumption on $(P_z^k)_{k\ge1}$, by Varadhan's integral lemma (see [8], Theorem 4.3.1), as $F_z$ is continuous and bounded and $J_z$ is assumed to be a good rate function, for all $z$ we have (4.5).

As for all $k\ge1$, $z\in\mathcal{Z}$ and $F\in C_{\mathcal{ZX}}$, we have $\big|\frac1k\log\langle e^{kF_z},P_z^k\rangle\big|\le\|F\|$, with (4.5) we see that

$$\Big|\sup_{x\in\mathcal{X}}\{F(z,x) - J_z(x)\}\Big|\le\|F\|. \tag{4.8}$$

These estimates allow us to apply the Lebesgue dominated convergence theorem to obtain (4.6) and (4.7).

For each $k$, $\Lambda_k$ is convex since $f\mapsto\log\langle e^{kf},P_z^k\rangle$ is convex as a log-Laplace transform and $\mu$ is a nonnegative measure. As a pointwise limit of convex functions, $\Lambda$ is also convex.

The convex functions $\Lambda_k$ and $\Lambda$ are $\sigma(C_{\mathcal{ZX}},C'_{\mathcal{ZX}})$-lower semicontinuous if and only if they are $\|\cdot\|$-lower semicontinuous on $C_{\mathcal{ZX}}$. But, because of (4.7), these convex functions are $\|\cdot\|$-continuous on the whole space $C_{\mathcal{ZX}}$. A fortiori, they are lower semicontinuous. □
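
The limit (4.5) can be observed numerically in a toy situation. The following is a minimal sketch, assuming a finite grid in place of $\mathcal{X}$ and the hypothetical family $P^k(x)\propto e^{-kJ(x)}$, which concentrates at rate $J$ when $\min J = 0$; the grid, `J`, `F` and the function names are illustrative choices and are not taken from the paper.

```python
import math

def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def normalized_log_laplace(F, J, k):
    """(1/k) log <e^{kF}, P^k> where P^k(x) is proportional to exp(-k J(x))."""
    num = logsumexp([k * (f - j) for f, j in zip(F, J)])
    den = logsumexp([-k * j for j in J])      # log of the normalizing constant
    return (num - den) / k

grid = [i / 100.0 - 1.0 for i in range(201)]  # finite stand-in for the state space
J = [x * x / 2.0 for x in grid]               # rate function with min J = 0
F = [math.sin(3.0 * x) for x in grid]         # bounded test function

limit = max(f - j for f, j in zip(F, J))      # right-hand side of (4.5)
values = {k: normalized_log_laplace(F, J, k) for k in (1, 10, 100, 10000)}
sup_norm = max(abs(f) for f in F)
```

On a grid with $N$ points the gap between `values[k]` and `limit` is at most $(\log N)/k$, and every `values[k]` stays within `sup_norm` in absolute value, in line with the bound used to obtain (4.7).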

4.2. The $(k,n)$-LDP for $\{K_n^k\}$. Let us introduce the convex conjugate of $\Lambda$:

$$\Lambda^*(q) = \sup_{F\in C_{\mathcal{ZX}}}\Big\{\langle q,F\rangle - \int_{\mathcal{Z}}\sup_{x\in\mathcal{X}}\{F(z,x) - J_z(x)\}\,\mu(dz)\Big\},\quad q\in C'_{\mathcal{ZX}}.$$

It will appear during the proof of Theorem 4.9 that it is the rate function of the $(k,n)$-LDP satisfied by $\{K_n^k\}_{k,n\ge1}$.

Theorem 4.9. Suppose that

(1) $(\mu_n)_{n\ge1}$ converges to $\mu$ in $\mathcal{P}_{\mathcal{Z}}$,

(2) for each k > 1, {Pz', z & Z) is a Feller system in the sense of Definition \2. g|. 

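(3) for each $z\in\mathcal{Z}$, $(P_z^k)_{k\ge1}$ obeys the $k$-LDP in $\mathcal{X}$ with the good rate function $J_z$.

Then $\{K_n^k\}_{k,n\ge1}$ obeys the $(k,n)$-LDP in $\mathcal{P}_{\mathcal{ZX}}$ with the affine good rate function

$$i(q) := \begin{cases}\displaystyle\int_{\mathcal{Z}\times\mathcal{X}} J_z(x)\,q(dzdx) & \text{if } q_{\mathcal{Z}}=\mu\\ +\infty & \text{otherwise}\end{cases},\qquad q\in\mathcal{P}_{\mathcal{ZX}}. \tag{4.10}$$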

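Proof. The framework of the proof is the same as that of Proposition 3.2, but it is technically more demanding.

For all $k,n\ge1$ and all $F\in C_{\mathcal{ZX}}$, the normalized log-Laplace transform of $K_n^k$ is defined by

$$\Lambda_k^n(F) := \frac{1}{kn}\log\mathbb{E}\exp\big(kn\langle F,K_n^k\rangle\big) = \frac1k\int_{\mathcal{Z}}\log\langle e^{kF_z},P_z^k\rangle\,\mu_n(dz),$$

where the second expression follows from the independence of the $X_{n,i}^k$. For fixed $k$, considering the limit as $n$ tends to infinity and taking assumptions (1) and (2) into account gives

$$\lim_{n\to\infty}\Lambda_k^n(F) = \Lambda_k(F).$$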

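By Corollary 3.6, we see that for all $k$, $\{K_n^k\}_{n\ge1}$ obeys the $n$-LDP in $C'_{\mathcal{ZX}}$ with the rate function $k\Lambda_k^*$.

Because of Lemma 4.4 and Lemma 4.1 applied with $\mathcal{F} = C_{\mathcal{ZX}}$ and $\mathcal{Q} = C'_{\mathcal{ZX}}$, the pointwise convergence (4.6) and the estimate (4.7) imply that

$$\Gamma\text{-}\lim_{k\to\infty}\Lambda_k^* = \Lambda^* \tag{4.11}$$

in $C'_{\mathcal{ZX}}$.

By Lemma 4.2, this $\Gamma$-convergence implies that $\{K_n^k\}_{k,n\ge1}$ obeys a weak $(k,n)$-LDP in $C'_{\mathcal{ZX}}$ with the rate function $\Lambda^*$. It is proved at Lemma 4.13 below that

$$\{\Lambda^*<+\infty\}\subset\mathcal{P}_{\mathcal{ZX}}. \tag{4.12}$$

A fortiori, $\{\Lambda^*<+\infty\}$ is included in the strong unit ball

$$U_{\mathcal{ZX}} = \Big\{q\in C'_{\mathcal{ZX}};\ \|q\|^* := \sup_{F\in C_{\mathcal{ZX}},\,\|F\|\le1}\langle q,F\rangle\le1\Big\}$$

of $C'_{\mathcal{ZX}}$ which is $\sigma(C'_{\mathcal{ZX}},C_{\mathcal{ZX}})$-compact (Banach-Alaoglu theorem). Consequently, $\{K_n^k\}_{k,n\ge1}$ obeys a strong $(k,n)$-LDP in $U_{\mathcal{ZX}}$ with the topology $\sigma(U_{\mathcal{ZX}},C_{\mathcal{ZX}})$ and the rate function $\Lambda^*$. With (4.12) again, we obtain that $\{K_n^k\}_{k,n\ge1}$ obeys the $(k,n)$-LDP in $\mathcal{P}_{\mathcal{ZX}}$ with the rate function $\Lambda^*$.

Let us show that the restriction of $\Lambda^*$ to $\mathcal{P}_{\mathcal{ZX}}$ has $\sigma(\mathcal{P}_{\mathcal{ZX}},C_{\mathcal{ZX}})$-compact level sets. As a convex conjugate, $\Lambda^*$ is $\sigma(C'_{\mathcal{ZX}},C_{\mathcal{ZX}})$-lower semicontinuous. Therefore, for all real $a$, $\{\Lambda^*\le a\}$ is $\sigma(C'_{\mathcal{ZX}},C_{\mathcal{ZX}})$-closed. But (4.12) implies that $\{\Lambda^*\le a\}$ is included in the $\sigma(C'_{\mathcal{ZX}},C_{\mathcal{ZX}})$-compact unit ball $U_{\mathcal{ZX}}$. Hence, $\{\Lambda^*\le a\}$ is $\sigma(C'_{\mathcal{ZX}},C_{\mathcal{ZX}})$-compact and by (4.12) again, the restriction of $\Lambda^*$ to $\mathcal{P}_{\mathcal{ZX}}$ is $\sigma(\mathcal{P}_{\mathcal{ZX}},C_{\mathcal{ZX}})$-inf-compact.

Finally, it will be proved at Proposition 4.14 that the restriction of $\Lambda^*$ to $\mathcal{P}_{\mathcal{ZX}}$ is $i$. This completes the proof of the theorem. □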

4.3. Identification of the rate function $\Lambda^*$. It remains to show that $\Lambda^* = i$. This is the most technical part of the paper.



Lemma 4.13. Under the assumptions of Lemma 4.4, the following statements hold true.

(a) For all $q\in C'_{\mathcal{ZX}}$, $\Lambda^*(q)<\infty$ implies that $q\in\mathcal{P}_{\mathcal{ZX}}$.

(b) For all $q\in\mathcal{P}_{\mathcal{ZX}}$, $\Lambda^*(q)<\infty$ implies that $q_{\mathcal{Z}}=\mu$.

Proof. It is similar to the proof of Lemma 3.7. As in Lemma 3.7, the $\|\cdot\|$-continuity of $q$ doesn't play any role, see Remark 3.13. Let $q\in C'_{\mathcal{ZX}}$ be such that

$$\sup_{F\in C_{\mathcal{ZX}}}\{\langle F,q\rangle - \Lambda(F)\} = \Lambda^*(q)<\infty.$$

An inspection of the proof of Lemma 3.7 shows that, to prove that $q\in\mathcal{P}_{\mathcal{ZX}}$, it is enough to check that $\Lambda$ satisfies

(i) for all $a\le0$ and all nonnegative $F_0\in C_{\mathcal{ZX}}$, $\Lambda(aF_0)\le0$,

(ii) for any constant function $F\equiv c\in\mathbb{R}$, $\Lambda(c\mathbf{1}) = c$,

(iii) for any sequence $(F_n)_{n\ge1}$ in $C_{\mathcal{ZX}}$ such that $F_n\ge0$ for all $n$ and $(F_n(z,x))_{n\ge1}$ decreases to zero for each $(z,x)\in\mathcal{Z}\times\mathcal{X}$, we have $\lim_{n\to\infty}\Lambda(aF_n) = 0$, for all $a\ge0$.

(i) As $J_z(x)\ge0$ for all $z$ and $x$, and $\mu\ge0$, we have $\Lambda(aF_0)\le0$ for all $a\le0$ and all nonnegative $F_0\in C_{\mathcal{ZX}}$.

(ii) As $\inf_{x\in\mathcal{X}}J_z(x) = 0$ for all $z$ and $\mu$ is a probability measure, for any constant function $F\equiv c\in\mathbb{R}$, we have $\Lambda(c\mathbf{1}) = c$.

(iii) By Lemma 4.26 below, for all $z\in\mathcal{Z}$, $\big(\sup_{x\in\mathcal{X}}\{F_n(z,x) - J_z(x)\}\big)_{n\ge1}$ is a decreasing sequence and $\lim_{n\to\infty}\sup_{x\in\mathcal{X}}\{F_n(z,x) - J_z(x)\} = 0$. As $|\sup_{x\in\mathcal{X}}\{F_n(z,x) - J_z(x)\}|\le\sup_{x\in\mathcal{X}}|F_1(z,x)|<\infty$ for all $n$ and $z$, one can apply the dominated convergence theorem to obtain that $\lim_{n\to\infty}\Lambda(aF_n) = 0$, for all $a\ge0$.

This completes the proof of statement (a).

Let us prove (b). Choosing $F(z,x) = g(z)$ not depending on $x$ with $g\in C_{\mathcal{Z}}$ in the expression of $\Lambda^*$, one sees that for all $q\in\mathcal{P}_{\mathcal{ZX}}$

$$\Lambda^*(q)\ge\sup_{g\in C_{\mathcal{Z}}}\{\langle g,q_{\mathcal{Z}}\rangle - \langle g,\mu\rangle\} = \begin{cases}0 & \text{if } q_{\mathcal{Z}}=\mu\\ +\infty & \text{otherwise}\end{cases}$$

which gives the announced result and completes the proof of Lemma 4.13. □

The very technical result of this section is the following Proposition 4.14. During its proof, we need some lemmas whose statements are included in the body of the proof. The proofs of these lemmas are postponed to the next subsection 4.4.

Proposition 4.14. For all $q\in\mathcal{P}_{\mathcal{ZX}}$, $\Lambda^*(q) = i(q)$.

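Proof. Thanks to Lemma 4.13-b, to prove that $\Lambda^* = i$, we have to show that $\Lambda^*(q) = \int_{\mathcal{Z}\times\mathcal{X}}J_z(x)\,q(dzdx)$ for all $q\in\mathcal{P}_{\mathcal{ZX}}$ such that $q_{\mathcal{Z}}=\mu$ or equivalently such that

$$q(dzdx) = \mu(dz)\,q^z(dx) \tag{4.15}$$

where

$$q^z(dx) = q(X\in dx\mid Z=z).$$

For such a $q$ we have

$$\begin{aligned}
\Lambda^*(q) &= \sup_{F\in C_{\mathcal{ZX}}}\int_{\mathcal{Z}}\Big[\langle F_z,q^z\rangle - \sup_{x\in\mathcal{X}}\{F_z(x) - J_z(x)\}\Big]\mu(dz)\\
&\le \int_{\mathcal{Z}}\sup_{f\in C_{\mathcal{X}}}\Big[\langle f,q^z\rangle - \sup_{x\in\mathcal{X}}\{f(x) - J_z(x)\}\Big]\mu(dz)\\
&\overset{(a)}{=} \int_{\mathcal{Z}\times\mathcal{X}}J_z(x)\,q^z(dx)\mu(dz)\\
&\overset{(b)}{=} i(q)
\end{aligned}$$

where equality (a) is given at the following Lemma 4.16 and equality (b) follows from (4.15).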

Lemma 4.16. Let $J$ be a $[0,+\infty]$-valued lower semicontinuous function on $\mathcal{X}$. For all $Q\in\mathcal{P}_{\mathcal{X}}$, we have

$$\sup_{f\in C_{\mathcal{X}}}\Big\{\int_{\mathcal{X}}f\,dQ - \sup_{x\in\mathcal{X}}\big(f(x) - J(x)\big)\Big\} = \int_{\mathcal{X}}J\,dQ.$$

The proof of this lemma is postponed until after the proof of the present proposition.

Note that $z\mapsto\langle J_z,q^z\rangle$ is measurable since $z\mapsto J_z(x)$ is assumed to be continuous for all $x$ and $z\mapsto q^z$ is a regular version of the disintegration of $q$.

It remains to show the converse inequality: $\Lambda^*(q)\ge i(q)$ for all $q$ satisfying (4.15). We would like to invert a sup and an integral to obtain

$$\begin{aligned}
\Lambda^*(q) &= \sup_{F\in C_{\mathcal{ZX}}}\int_{\mathcal{Z}}\Big[\langle F_z,q^z\rangle - \sup_{x\in\mathcal{X}}\{F_z(x) - J_z(x)\}\Big]\mu(dz)\\
&= \int_{\mathcal{Z}}\sup_{f\in C_{\mathcal{X}}}\Big[\langle f,q^z\rangle - \sup_{x\in\mathcal{X}}\{f(x) - J_z(x)\}\Big]\mu(dz).
\end{aligned}\tag{4.17}$$

As a first step, we are going to prove this equality under the restrictive assumption that $\mathcal{X}$ is compact. Its proof relies on the following result which is due to R. T. Rockafellar (see [20], Theorem 2).



Lemma 4.18. Let $(\mathcal{Z},\mu)$ be a measure space such that $\mu$ is $\sigma$-finite. Let $L$ be a decomposable space (see below for the definition) of measurable functions $F$ on $\mathcal{Z}$ with their values in a Polish space $\mathcal{Y}$ equipped with its Borel $\sigma$-field. Let $\theta:\mathcal{Z}\times\mathcal{Y}\to[-\infty,\infty)$ be such that

- $\theta$ is jointly measurable,

- $\theta$ is not identically equal to $-\infty$ and

- $y\mapsto\theta(z,y)$ is upper semicontinuous for all $z\in\mathcal{Z}$.

In this case, one says that $-\theta$ is normal. Suppose in addition that there exist some $F_1\in L$ and some $u_1\in L^1(\mu)$ such that $\theta(z,F_1(z))\ge u_1(z)$ for $\mu$-almost every $z$ in $\mathcal{Z}$. Then, $z\mapsto\sup_{y\in\mathcal{Y}}\theta(z,y)$ is measurable and

$$\sup_{F\in L}\int_{\mathcal{Z}}\theta(z,F(z))\,\mu(dz) = \int_{\mathcal{Z}}\sup_{y\in\mathcal{Y}}\theta(z,y)\,\mu(dz)\in(-\infty,\infty].$$

Definition 4.19. The space $L$ is said to be decomposable if, whenever $F$ belongs to $L$ and $F_0:\mathcal{Z}_0\to\mathcal{Y}$ is a bounded measurable function on a measurable set $\mathcal{Z}_0\subset\mathcal{Z}$ of finite measure, the function $z\mapsto\mathbf{1}_{z\in\mathcal{Z}_0}F_0(z) + \mathbf{1}_{z\notin\mathcal{Z}_0}F(z)$ also belongs to $L$.

In order to obtain (4.17), we would like to apply this lemma with

• $\mathcal{Y} = C_{\mathcal{X}}$ equipped with the topology of uniform convergence,

• $\theta(z,f) = \langle f,q^z\rangle - \sup_{x\in\mathcal{X}}\{f(x) - J_z(x)\}$ and

• $L = C_b(\mathcal{Z},C_{\mathcal{X}})\simeq C_{\mathcal{ZX}}$.

Unfortunately, two troubles occur.

Trouble 1: If $\mathcal{X}$ is not compact, $\mathcal{Y} = C_{\mathcal{X}}$ is not separable and fails to be a Polish space as required in the lemma. On the other hand, if $\mathcal{X}$ is compact, $C_{\mathcal{X}}$ is Polish.

Trouble 2: The space $C_{\mathcal{ZX}}\simeq C_b(\mathcal{Z},C_{\mathcal{X}})$ is not decomposable. On the other hand, the space $B(\mathcal{Z},C_{\mathcal{X}})$ of all bounded and measurable functions $F: z\in\mathcal{Z}\mapsto F_z\in C_{\mathcal{X}}$ is decomposable.

Note that when $\mathcal{X}$ is compact, as $C_{\mathcal{X}}$ is separable, we have $B(\mathcal{Z},C_{\mathcal{X}})\simeq\mathcal{F}$ where $\mathcal{F}$ is the space of all the functions on $\mathcal{Z}\times\mathcal{X}$ which are bounded, $x$-continuous and $z$-measurable; such functions are jointly measurable.

We are going to apply Lemma 4.18 with

• $\mathcal{Y} = C_{\mathcal{X}}$ and $\mathcal{X}$ a compact Polish set,

• $\theta(z,f) = \langle f,q^z\rangle - \sup_{x\in\mathcal{X}}\{f(x) - J_z(x)\}$ for all $z\in\mathcal{Z}$ and $f\in C_{\mathcal{X}}$, where $q\in\mathcal{P}_{\mathcal{ZX}}$ is fixed and satisfies (4.15) and

• $L = \mathcal{F}$.

As $f\mapsto\theta(z,f)$ is continuous for all $z$ and $z\mapsto\theta(z,f)$ is measurable for all $f$, $\theta$ is jointly measurable. Taking $f = 0$ gives $\theta(z,0) = 0>-\infty$ for all $z$, so that $\theta$ shares all the normality conditions of the lemma.

Choosing the functions $F_1\equiv0\in L$ and $u_1\equiv0\in L^1(\mu)$ leads us to $0 = \theta(z,F_1(z))\ge u_1(z) = 0$ for every $z$ in $\mathcal{Z}$.

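Therefore, we have shown that all the assumptions of Lemma 4.18 are met so that

$$\sup_{F\in\mathcal{F}}\int_{\mathcal{Z}}\Big[\langle F_z,q^z\rangle - \sup_{x\in\mathcal{X}}\{F_z(x) - J_z(x)\}\Big]\mu(dz) = \int_{\mathcal{Z}}\sup_{f\in C_{\mathcal{X}}}\Big[\langle f,q^z\rangle - \sup_{x\in\mathcal{X}}\{f(x) - J_z(x)\}\Big]\mu(dz) \tag{4.20}$$

whenever $\mathcal{X}$ is a compact Polish space.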

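To obtain (4.17), it remains to prove that for all $q$ with $q_{\mathcal{Z}}=\mu$,

$$\Lambda^*(q) = \sup_{F\in\mathcal{F}}\int_{\mathcal{Z}}\Big[\langle F_z,q^z\rangle - \sup_{x\in\mathcal{X}}\{F_z(x) - J_z(x)\}\Big]\mu(dz). \tag{4.21}$$

Let us prove it without assuming that $\mathcal{X}$ is compact. Rather than invoking an abstract approximation argument, we present a specific proof of (4.21). Rewriting the above proof of Theorem 4.9 with $C_{\mathcal{ZX}}$ replaced with the space $B(\mathcal{Z}\times\mathcal{X})$ of bounded measurable functions on $\mathcal{Z}\times\mathcal{X}$, one gets the following result.

A variant of Theorem 4.9. Assuming (2) and (3) of Theorem 4.9, if Assumption (1) is strengthened by "$(\mu_n)_{n\ge1}$ converges to $\mu$ in $\mathcal{P}_{\mathcal{Z}}$ for the stronger topology $\sigma(\mathcal{P}_{\mathcal{Z}},B(\mathcal{Z}))$", then $\{K_n^k\}_{k,n\ge1}$ obeys the $(k,n)$-LDP in $\mathcal{P}_{\mathcal{ZX}}$ with the topology $\sigma(\mathcal{P}_{\mathcal{ZX}},B(\mathcal{Z}\times\mathcal{X}))$ and the rate function $\tilde{\imath}(q) = \sup_{F\in B(\mathcal{Z}\times\mathcal{X})}\int_{\mathcal{Z}}\big[\langle F_z,q^z\rangle - \sup_{x\in\mathcal{X}}\{F_z(x) - J_z(x)\}\big]\mu(dz)$ if $q\in\mathcal{P}_{\mathcal{ZX}}$ satisfies $q_{\mathcal{Z}}=\mu$, and $\tilde{\imath}(q) = +\infty$ otherwise.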

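For any $\mu\in\mathcal{P}_{\mathcal{Z}}$, there exists a sequence of empirical measures $(\mu_n)_{n\ge1}$ as in (3.1) which converges to $\mu$ with respect to the topology $\sigma(\mathcal{P}_{\mathcal{Z}},B(\mathcal{Z}))$. This can be seen as a consequence of the almost sure convergence, as $n$ tends to infinity, of the empirical measures $\frac1n\sum_{1\le i\le n}\delta_{Z_i}$ of the $\mu$-iid sequence of $\mathcal{Z}$-valued random variables $(Z_i)_{i\ge1}$ towards $\mu$ for the topology $\sigma(\mathcal{P}_{\mathcal{Z}},B(\mathcal{Z}))$, which in turn is a corollary of the strengthened version of Sanov's theorem with the topology $\sigma(\mathcal{P}_{\mathcal{Z}},B(\mathcal{Z}))$ on a Polish space $\mathcal{Z}$. With such a sequence $(\mu_n)_{n\ge1}$, by Theorem 4.9 and its variant, $\{K_n^k\}_{k,n\ge1}$ obeys the $(k,n)$-LDP in $\mathcal{P}_{\mathcal{ZX}}$ with the rate functions $\Lambda^*$ and $\tilde{\imath}$. As the rate function of a LDP is unique in a regular space (for the double index version of this known result, see [H]), we have $\Lambda^* = \tilde{\imath}$. It follows that for all $q$ with $q_{\mathcal{Z}}=\mu$,

$$\begin{aligned}
\Lambda^*(q) &:= \sup_{F\in C_{\mathcal{ZX}}}\int_{\mathcal{Z}}\Big[\langle F_z,q^z\rangle - \sup_{x\in\mathcal{X}}\{F_z(x) - J_z(x)\}\Big]\mu(dz)\\
&\le \sup_{F\in\mathcal{F}}\int_{\mathcal{Z}}\Big[\langle F_z,q^z\rangle - \sup_{x\in\mathcal{X}}\{F_z(x) - J_z(x)\}\Big]\mu(dz)\\
&\le \sup_{F\in B(\mathcal{Z}\times\mathcal{X})}\int_{\mathcal{Z}}\Big[\langle F_z,q^z\rangle - \sup_{x\in\mathcal{X}}\{F_z(x) - J_z(x)\}\Big]\mu(dz) = \Lambda^*(q)
\end{aligned}$$

which implies the desired equality (4.21).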

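Thanks to (4.20) and (4.21), we have proved (4.17) whenever $\mathcal{X}$ is compact. Nevertheless, the identity (4.17) will not be used directly. We shall only use (4.21) and a variant of (4.20).

Now, we have to tackle the problem of relaxing the requirement that $\mathcal{X}$ is compact. Let us take advantage of the tightness of $q$ (it is a probability on a Polish space). This means that there exists an increasing sequence $(\mathcal{K}_n)_{n\ge1}$ of compact subsets of $\mathcal{Z}\times\mathcal{X}$ such that $q(\mathcal{K}_n)\ge1-1/n$ for all $n\ge1$. As a continuous image of a compact set, $\mathcal{X}_n^q := \{x\in\mathcal{X};\ (z,x)\in\mathcal{K}_n \text{ for some } z\in\mathcal{Z}\}$ is a compact set. We also have $q(\mathcal{Z}\times\mathcal{X}_n^q)\ge1-1/n$ for all $n$. It follows that for $q_{\mathcal{Z}}$-almost every $z\in\mathcal{Z}$, $q^z$ is determined by the values $\langle f,q^z\rangle$ where $f$ describes the set $\bigcup_{n\ge1}C(\mathcal{X}_n^q)$ where, for any measurable set $\mathcal{X}_0$ in $\mathcal{X}$, we denote

$$C(\mathcal{X}_0) = \mathbf{1}_{\mathcal{X}_0}\cdot C_{\mathcal{X}} = \{f:\mathcal{X}\to\mathbb{R};\ f = \mathbf{1}_{\mathcal{X}_0}\tilde f \text{ for some } \tilde f\in C_{\mathcal{X}}\}. \tag{4.22}$$

To see this, remark that for every measurable set $A$ in $\mathcal{X}$ such that $A\cap(\bigcup_n\mathcal{X}_n^q) = \emptyset$, we have $\int_{\mathcal{Z}}q^z(A)\,\mu(dz) = q(\mathcal{Z}\times A) = \lim_{n\to\infty}q(\mathcal{Z}\times(A\cap\mathcal{X}_n^q)) = 0$.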

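We can now proceed with the proof of $\Lambda^*(q)\ge i(q)$ for all $q$ satisfying (4.15). For all such $q$ we have

$$\begin{aligned}
\Lambda^*(q) &= \sup_{F\in\mathcal{F}}\int_{\mathcal{Z}}\Big[\langle F_z,q^z\rangle - \sup_{x\in\mathcal{X}}\{F_z(x) - J_z(x)\}\Big]\mu(dz)\\
&\overset{(a)}{\ge} \sup_{n\ge1}\ \sup_{F\in B(\mathcal{Z},C(\mathcal{X}_n^q))}\int_{\mathcal{Z}}\Big[\langle F_z,q^z\rangle - \sup_{x\in\mathcal{X}}\{F_z(x) - J_z(x)\}\Big]\mu(dz)\\
&\overset{(b)}{=} \sup_{n\ge1}\int_{\mathcal{Z}}\sup_{f\in C(\mathcal{X}_n^q)}\Big[\langle f,q^z\rangle - \sup_{x\in\mathcal{X}}\{f(x) - J_z(x)\}\Big]\mu(dz)\\
&\overset{(c)}{=} \int_{\mathcal{Z}}\sup_{f\in\bigcup_{n\ge1}C(\mathcal{X}_n^q)}\Big[\langle f,q^z\rangle - \sup_{x\in\mathcal{X}}\{f(x) - J_z(x)\}\Big]\mu(dz)\\
&\overset{(d)}{=} \int_{\mathcal{Z}\times\mathcal{X}}J_z(x)\,q^z(dx)\mu(dz)\\
&= i(q)
\end{aligned}$$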

where the first equality is (4.21). The remaining series of inequality and equalities needs to be justified. This will require two more lemmas, the proofs of which are postponed until after the proof of the present proposition.

• Inequality (a). It is enough to show that for any function $F\in B(\mathcal{Z},C(\mathcal{X}_0))$ with $\mathcal{X}_0$ a compact subset of $\mathcal{X}$, there exists a sequence $(F^n)_{n\ge1}$ in $\mathcal{F}$ such that

$$\int_{\mathcal{Z}}\Big[\langle F_z,q^z\rangle - \sup_{x\in\mathcal{X}}\{F_z(x) - J_z(x)\}\Big]\mu(dz) = \lim_{n\to\infty}\int_{\mathcal{Z}}\Big[\langle F_z^n,q^z\rangle - \sup_{x\in\mathcal{X}}\{F_z^n(x) - J_z(x)\}\Big]\mu(dz). \tag{4.23}$$

Let us show that

$$F_z^n(x) := \sup\{F_z(y) - n\,d(x,y);\ y\in\mathcal{X}\}$$

does this job. For each $z$, $(-F_z^n)_{n\ge1}$ is the Moreau-Yosida approximation of $-F_z$, and it is a well-known result (see Section 1.7.3 for instance) that

- for all $z\in\mathcal{Z}$, $x\mapsto F_z^n(x)$ is $n$-Lipschitz,

and for all $(z,x)\in\mathcal{Z}\times\mathcal{X}$,

- $-\|F\|\le F_z(x)\le F_z^n(x)\le\|F_z\|\le\|F\|$, where $\|\cdot\|$ stands for the uniform norm,

- $(F_z^n(x))_{n\ge1}$ is a decreasing sequence and

- $\lim_{n\to\infty}F_z^n(x) = F_z(x)$.

For the last statement, note that it is necessary that $F_z$ is upper semicontinuous on $\mathcal{X}$. But this is ensured by the assumption that $\mathcal{X}_0$ is closed and $F_z\in C(\mathcal{X}_0)$.

Now let us make sure that for any $x_0\in\mathcal{X}$, $z\mapsto F_z^n(x_0)$ is measurable. For all real $a$, we have

$$\begin{aligned}
F_z^n(x_0)\le a &\iff \forall y\in\mathcal{X},\ F_z(y) - n\,d(x_0,y)\le a\\
&\iff \forall k\ge1,\ F_z(x_k) - n\,d(x_0,x_k)\le a
\end{aligned}$$

where $\{x_k;\ k\ge1\}$ is a countable dense subset of $\mathcal{X}$ (recall that $\mathcal{X}$ is Polish). This holds, since $y\mapsto F_z(y) - n\,d(x_0,y)$ is continuous. It follows that $\{z\in\mathcal{Z};\ F_z^n(x_0)\le a\} = \bigcap_{k\ge1}\{z\in\mathcal{Z};\ F_z(x_k) - n\,d(x_0,x_k)\le a\}$. As $z\mapsto F_z(x_k)$ is measurable for all $k$, this proves the measurability of $z\mapsto F_z^n(x_0)$. Therefore, $F^n$ belongs to $\mathcal{F}$ for all $n\ge1$.

With the estimate $-\|F\|\le F_z(x)\le F_z^n(x)\le\|F\|<\infty$ and the limit $\lim_{n\to\infty}F_z^n(x) = F_z(x)$, one can apply the dominated convergence theorem to obtain that

$$\lim_{n\to\infty}\int_{\mathcal{Z}}\langle F_z^n,q^z\rangle\,\mu(dz) = \lim_{n\to\infty}\int_{\mathcal{Z}\times\mathcal{X}}F^n\,dq = \int_{\mathcal{Z}\times\mathcal{X}}F\,dq = \int_{\mathcal{Z}}\langle F_z,q^z\rangle\,\mu(dz). \tag{4.24}$$

Similarly, the limit

$$\lim_{n\to\infty}\int_{\mathcal{Z}}\sup_{x\in\mathcal{X}}\{F_z^n(x) - J_z(x)\}\,\mu(dz) = \int_{\mathcal{Z}}\sup_{x\in\mathcal{X}}\{F_z(x) - J_z(x)\}\,\mu(dz) \tag{4.25}$$

follows from the estimate (4.8) and the following lemma.

Lemma 4.26. Let $J$ be an inf-compact $[0,\infty]$-valued function on $\mathcal{X}$ and $(f_n)_{n\ge1}$ a decreasing sequence of continuous bounded functions on $\mathcal{X}$ which converges pointwise to some bounded upper semicontinuous function $f$. Then, $\big(\sup_{x\in\mathcal{X}}\{f_n(x) - J(x)\}\big)_{n\ge1}$ is a decreasing sequence and

$$\lim_{n\to\infty}\sup_{x\in\mathcal{X}}\{f_n(x) - J(x)\} = \sup_{x\in\mathcal{X}}\{f(x) - J(x)\}.$$

The proof of this lemma is postponed until after the proof of the present proposition.
Finally, (4.23) follows from (4.24) and (4.25).

• Equality (b) is a variant of (4.20) applied with the compact set $\mathcal{X}_n^q$.






• Equality (c). If the sequence $(C(\mathcal{X}_n^q))_{n\ge1}$ were increasing, equality (c) would be a direct consequence of the monotone convergence theorem. Nevertheless, this is almost the case since, for any pair of closed subsets $\mathcal{X}_0$ and $\mathcal{X}_1$ of $\mathcal{X}$ such that $\mathcal{X}_0\subset\mathcal{X}_1$, any function $f\in C(\mathcal{X}_0)$ can be approximated pointwise by a uniformly bounded decreasing sequence $(f_n)$ in $C(\mathcal{X}_1)$ such that $\lim_{n\to\infty}\sup_{x\in\mathcal{X}}\{f_n(x) - J_z(x)\} = \sup_{x\in\mathcal{X}}\{f(x) - J_z(x)\}$. One proves this, exactly as for inequality (a), by means of a Moreau-Yosida approximation and Lemma 4.26. With this in hand, equality (c) follows from the monotone convergence theorem.

• Equality (d). This equality is a consequence of the following lemma.

Lemma 4.27. Let $J$ be a $[0,+\infty]$-valued lower semicontinuous function on $\mathcal{X}$. If $C_{\mathcal{X}}$ in Lemma 4.16 is replaced with the set $\mathcal{G}_Q = \bigcup_{n\ge1}C(\mathcal{X}_n^Q)$ where $(\mathcal{X}_n^Q)_{n\ge1}$ is an increasing sequence of closed subsets of $\mathcal{X}$ such that $\lim_{n\to\infty}Q(\mathcal{X}_n^Q) = 1$, then we still have

$$\sup_{f\in\mathcal{G}_Q}\Big\{\int_{\mathcal{X}}f\,dQ - \sup_{x\in\mathcal{X}}\big(f(x) - J(x)\big)\Big\} = \int_{\mathcal{X}}J\,dQ.$$

The proof of this lemma is postponed until after the proof of the present proposition.

Note that we have already remarked that for $q_{\mathcal{Z}}$-almost every $z\in\mathcal{Z}$, $q^z$ is determined by the values $\langle f,q^z\rangle$ where $f$ describes the set $\bigcup_{n\ge1}C(\mathcal{X}_n^q)$. One obtains equality (d) by means of Lemma 4.27 with $\mathcal{G}_{q^z} = \bigcup_{n\ge1}C(\mathcal{X}_n^q)$, $z\in\mathcal{Z}$.

We have proved that $\Lambda^*\ge i$ and this completes the proof of the proposition. □
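
The regularization $F_z^n(x) = \sup_y\{F_z(y) - n\,d(x,y)\}$ used for inequality (a) can be explored on a finite set of real points. The following is a minimal sketch, assuming the metric $d(x,y) = |x-y|$ and a hypothetical upper semicontinuous $F$ obtained by cutting a continuous function off outside the closed set $[-0.5, 0.5]$; the names `sup_convolution`, `points` and `F` are illustrative and are not taken from the paper.

```python
def sup_convolution(F, points, n):
    """F^n(x) = max_y { F(y) - n * |x - y| } over a finite set of real points."""
    return [max(fy - n * abs(x - y) for fy, y in zip(F, points)) for x in points]

points = [i * 0.05 - 1.0 for i in range(41)]            # grid on [-1, 1]
# Hypothetical F: equal to 1 - x^2 on the closed set [-0.5, 0.5], zero elsewhere.
F = [(1.0 - x * x) if abs(x) <= 0.5 else 0.0 for x in points]

levels = (1, 2, 5, 25)
approx = {n: sup_convolution(F, points, n) for n in levels}

def lipschitz_constant(values, points):
    """Largest slope |v_i - v_j| / |x_i - x_j| over distinct grid points."""
    return max(
        abs(vi - vj) / abs(xi - xj)
        for vi, xi in zip(values, points)
        for vj, xj in zip(values, points)
        if xi != xj
    )
```

On this grid each `approx[n]` lies between `F` and `max(F)`, the family decreases as `n` grows, `lipschitz_constant(approx[n], points)` does not exceed `n`, and `approx[25]` already coincides with `F` because the grid spacing times 25 exceeds the oscillation of `F`.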



A comment on this proof. One could think of replacing the spaces $C(\mathcal{X}_n^q)$ defined by (4.22) with the smaller spaces of continuous functions on $\mathcal{X}$ with their support in $\mathcal{X}_n^q$. This clearly provides an increasing sequence and simplifies the proof of equality (c). But unfortunately, equality (a) doesn't work anymore since such a space reduces to the null space when the compact set $\mathcal{X}_n^q$ has an empty interior (a common feature in infinite dimension).

4.4. Proofs of the lemmas. We go on with the proofs of Lemmas 4.16, 4.26 and 4.27.

Proof of Lemmas 4.16 and 4.27. Lemma 4.16 is a particular case of Lemma 4.27; we only prove Lemma 4.27.

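As $\sup_{R\in\mathcal{P}_{\mathcal{X}}}\langle f-J,R\rangle\le\sup_{x\in\mathcal{X}}\{f(x) - J(x)\} = \sup_{x\in\mathcal{X}}\langle f-J,\delta_x\rangle\le\sup_{R\in\mathcal{P}_{\mathcal{X}}}\langle f-J,R\rangle$, we have $\sup_{x\in\mathcal{X}}\{f(x) - J(x)\} = \sup_{R\in\mathcal{P}_{\mathcal{X}}}\langle f-J,R\rangle$. Therefore,

$$\begin{aligned}
\sup_{f\in\mathcal{G}_Q}\Big\{\langle f,Q\rangle - \sup_{x\in\mathcal{X}}\{f(x) - J(x)\}\Big\}
&= \sup_{f\in\mathcal{G}_Q}\Big\{\langle f,Q\rangle - \sup_{R\in\mathcal{P}_{\mathcal{X}}}\langle f-J,R\rangle\Big\}\\
&= \langle J,Q\rangle + \sup_{f\in\mathcal{G}_Q}\Big\{\langle f-J,Q\rangle - \sup_{R\in\mathcal{P}_{\mathcal{X}}}\langle f-J,R\rangle\Big\}\\
&\le \langle J,Q\rangle
\end{aligned}$$

where the last inequality holds since $Q\in\mathcal{P}_{\mathcal{X}}$.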

Now, let's prove the converse inequality. As $J$ is a lower semicontinuous function which is bounded below, it is the pointwise limit of an increasing sequence $(J_n)_{n\ge1}$ in $C_{\mathcal{X}}$: once again, the Moreau-Yosida approximation $J_n(x) = \inf\{J(y) + n\,d(x,y);\ y\in\mathcal{X}\}$.

Let us define $\tilde J_n(x) = \mathbf{1}_{\mathcal{X}_n^Q}(x)\,(0\vee J_n(x)\wedge n)$ for all $x$ and $n$. As $(\mathcal{X}_n^Q)_{n\ge1}$ is an increasing sequence of sets, $(\tilde J_n)_{n\ge1}$ is an increasing sequence of functions such that for all $n$, $\tilde J_n$ is in $C(\mathcal{X}_n^Q)$. We have

$$\begin{aligned}
\sup_{f\in\mathcal{G}_Q}\Big\{\langle f,Q\rangle - \sup_{x\in\mathcal{X}}\{f(x) - J(x)\}\Big\}
&\overset{(a)}{\ge} \sup_{n\ge1}\Big\{\langle\tilde J_n,Q\rangle - \sup_{x\in\mathcal{X}}\{\tilde J_n(x) - J(x)\}\Big\}\\
&\overset{(b)}{\ge} \sup_{n\ge1}\int_{\mathcal{X}}\tilde J_n\,dQ\\
&\overset{(c)}{=} \sup_{k\ge1}\ \sup_{n\ge k}\int_{\mathcal{X}_k^Q}(0\vee J_n\wedge n)\,dQ\\
&\overset{(d)}{=} \sup_{k\ge1}\int_{\mathcal{X}_k^Q}J\,dQ\\
&\overset{(e)}{=} \int_{\mathcal{X}}J\,dQ
\end{aligned}$$

where inequality (a) holds since $\tilde J_n\in C(\mathcal{X}_n^Q)$, inequality (b) follows from $\tilde J_n\le J$, equality (c) holds since the sequence $(\mathcal{X}_n^Q)$ is increasing, equality (d) follows from the monotone convergence theorem and equality (e) follows from the monotone convergence theorem together with $\lim_{k\to\infty}Q(\mathcal{X}\setminus\mathcal{X}_k^Q) = 0$. This completes the proof of the lemmas. □
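
The two halves of this proof can be mirrored on a finite set, where every function is continuous. The following is a minimal sketch, assuming a four-point space with hypothetical values for `J` and `Q` that are not taken from the paper: random test functions never exceed $\int J\,dQ$ (the upper bound), while the truncations $\min(J,n)$ reach it (the lower bound).

```python
import random

def dual_value(f, J, Q):
    """<f, Q> - sup_x { f(x) - J(x) } on a finite set."""
    return sum(fx * qx for fx, qx in zip(f, Q)) - max(fx - jx for fx, jx in zip(f, J))

J = [0.0, 0.5, 2.0, 7.0]        # hypothetical nonnegative function on four points
Q = [0.1, 0.2, 0.3, 0.4]        # hypothetical probability weights
target = sum(jx * qx for jx, qx in zip(J, Q))    # integral of J with respect to Q

# Lower bound: the truncations min(J, n) increase to J.
truncated = [dual_value([min(jx, n) for jx in J], J, Q) for n in (1, 2, 4, 8)]

# Upper bound: no test function does better than the target.
rng = random.Random(0)
random_values = [
    dual_value([rng.uniform(-10.0, 10.0) for _ in J], J, Q) for _ in range(1000)
]
```

Here `truncated` is nondecreasing and its last entry equals `target`, while every entry of `random_values` stays below `target`.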



Proof of Lemma 4.26. Changing sign and denoting $g_n(x) = J(x) - f_n(x)$, $g(x) = J(x) - f(x)$, we want to prove that $\lim_{n\to\infty}\inf_{x\in\mathcal{X}}g_n(x) = \inf_{x\in\mathcal{X}}g(x)$.

We see that $(g_n)_{n\ge1}$ is an increasing sequence of lower semicontinuous functions. It follows by Proposition 5.4 of [15] that it is a $\Gamma$-convergent sequence and

$$\Gamma\text{-}\lim_{n\to\infty}g_n = \lim_{n\to\infty}g_n = g. \tag{4.28}$$

Let us admit for a while that there exists some compact set $K$ which satisfies

$$\inf_{x\in\mathcal{X}}g_n(x) = \inf_{x\in K}g_n(x) \tag{4.29}$$

for all $n$. This and the convergence (4.28) allow us to apply Theorem 7.4 of [12] to obtain $\lim_{n\to\infty}\inf_{x\in\mathcal{X}}g_n(x) = \inf_{x\in\mathcal{X}}\Gamma\text{-}\lim_{n\to\infty}g_n(x) = \inf_{x\in\mathcal{X}}g(x)$ which is the desired result.

It remains to check that (4.29) is true. Let $x_*\in\mathcal{X}$ be such that $J(x_*)<\infty$ (if $J\equiv+\infty$, there is nothing to prove). Then, $\inf_{x\in\mathcal{X}}g_n(x)\le g_n(x_*) = J(x_*) - f_n(x_*)\le J(x_*) - f(x_*)\le J(x_*) - \inf_{x\in\mathcal{X}}f(x)$. On the other hand, for all $x$ and $n$, $f_n(x)\le f_1(x)\le A := \sup f_1$. Let $B := A + 1 + J(x_*) - \inf_{x\in\mathcal{X}}f(x)$. For all $x$ such that $J(x)>B$, we have $g_n(x)>B - \sup_{x\in\mathcal{X}}f_n(x)\ge B - A = J(x_*) - \inf_{x\in\mathcal{X}}f(x) + 1$. We have just seen that for all $n$,

$$\inf_{x\in\mathcal{X}}g_n(x)\le J(x_*) - \inf_{x\in\mathcal{X}}f(x),$$

$$\inf_{x;\,J(x)>B}g_n(x)\ge J(x_*) - \inf_{x\in\mathcal{X}}f(x) + 1.$$

This proves (4.29) with the compact level set $K = \{J\le B\}$ and completes the proof of the lemma. □

5. Large deviations of a doubly indexed sequence of random measures. Main results

Theorem 4.9 states a $(k,n)$-LDP for $K_n^k = \frac1n\sum_{i=1}^n\delta_{(z_{n,i},X_{n,i}^k)}$ but we are mostly interested in the $(k,n)$-LDP in $\mathcal{P}_{\mathcal{X}}$ of $L_n^k = \frac1n\sum_{i=1}^n\delta_{X_{n,i}^k}$. It will easily follow from Theorem 4.9 and the contraction principle. Let us denote

$$P^k(dx) = \int_{\mathcal{Z}}P_z^k(dx)\,\mu(dz)\in\mathcal{P}_{\mathcal{X}},\quad k\ge1.$$

Theorem 5.1. Suppose that

(1) $(\mu_n)_{n\ge1}$ converges to $\mu$ in $\mathcal{P}_{\mathcal{Z}}$,

(2) for each k > 1, {P^; z & Z) is a Feller system in the sense of Definition \2. ^ 

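(3) for each $z\in\mathcal{Z}$, $(P_z^k)_{k\ge1}$ obeys the $k$-LDP in $\mathcal{X}$ with the good rate function $J_z$.

Then the following statements hold true.

(a) $\{L_n^k\}_{k,n\ge1}$ obeys the $(k,n)$-LDP in $\mathcal{P}_{\mathcal{X}}$ with the good rate function $I$ which is defined for all $Q\in\mathcal{P}_{\mathcal{X}}$ by

$$I(Q) = \inf\Big\{\int_{\mathcal{Z}\times\mathcal{X}}J_z(x)\,\mu(dz)\Pi_z(dx);\ (\Pi_z)_{z\in\mathcal{Z}} : \int_{\mathcal{Z}}\Pi_z\,\mu(dz) = Q\Big\} \tag{5.2}$$

where the transition kernels $z\in\mathcal{Z}\mapsto\Pi_z\in\mathcal{P}_{\mathcal{X}}$ are measurable.

(b) Another representation of this rate function is

$$I(Q) = \sup_{f\in C_{\mathcal{X}}}\Big\{\int_{\mathcal{Z}}Sf(z)\,\mu(dz) - \int_{\mathcal{X}}f(x)\,Q(dx)\Big\},\quad Q\in\mathcal{P}_{\mathcal{X}}$$

where $Sf(z)$ is defined for all $z\in\mathcal{Z}$ by

$$Sf(z) = \inf_{x\in\mathcal{X}}\{J_z(x) + f(x)\}.$$

(c) If $I(Q)<+\infty$, there exists a (possibly not unique) kernel $(\Pi_z^*)_{z\in\mathcal{Z}}$ which realizes the infimum in (5.2).

(d) If for each $k$ the Feller system $(P_z^k)_{z\in\mathcal{Z}}$ satisfies

$$P_z^k = P^k(\,\cdot\mid\beta(X)=z) \tag{5.3}$$

for $\mu$-almost every $z\in\mathcal{Z}$ and some continuous function $\beta:\mathcal{X}\to\mathcal{Z}$, we have

$$I(Q) = \begin{cases}\displaystyle\int_{\mathcal{X}}J_{\beta(x)}(x)\,Q(dx) & \text{if } \beta\circ Q=\mu\\ +\infty & \text{otherwise.}\end{cases} \tag{5.4}$$

The dual space $C'_{\mathcal{X}}$ of $(C_{\mathcal{X}},\|\cdot\|)$ is equipped with the $*$-weak topology $\sigma(C'_{\mathcal{X}},C_{\mathcal{X}})$, see Section 1.7.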
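
The primal formula (5.2) and the dual formula of statement (b) can be compared by brute force when both spaces have two points. The following is a minimal sketch, assuming the hypothetical data `J`, `mu`, `Q` below (not taken from the paper); in the finite case a kernel $(\Pi_z)$ with $\int\Pi_z\,\mu(dz) = Q$ is the same thing as a coupling of `mu` and `Q`, and all $2\times2$ couplings are parametrized by their entry `t = q[0][0]`.

```python
import random

def primal(J, mu, Q, steps=1000):
    """Grid search for min of sum J[z][x] q[z][x] over 2x2 couplings of mu and Q."""
    lo, hi = max(0.0, mu[0] + Q[0] - 1.0), min(mu[0], Q[0])
    best = float("inf")
    for s in range(steps + 1):
        t = lo + (hi - lo) * s / steps               # t = q[0][0]
        q = [[t, mu[0] - t], [Q[0] - t, mu[1] - Q[0] + t]]
        best = min(best, sum(J[z][x] * q[z][x] for z in range(2) for x in range(2)))
    return best

def dual(f, J, mu, Q):
    """sum_z Sf(z) mu(z) - sum_x f(x) Q(x), with Sf(z) = min_x { J[z][x] + f(x) }."""
    Sf = [min(J[z][x] + f[x] for x in range(2)) for z in range(2)]
    return sum(Sf[z] * mu[z] for z in range(2)) - sum(f[x] * Q[x] for x in range(2))

J = [[0.0, 1.0], [1.0, 0.0]]    # hypothetical cost J_z(x), zero on the diagonal
mu = [0.5, 0.5]                 # law of z
Q = [0.2, 0.8]                  # target law of x

primal_value = primal(J, mu, Q)
rng = random.Random(1)
dual_samples = [dual([rng.uniform(-5.0, 5.0) for _ in range(2)], J, mu, Q)
                for _ in range(2000)]
dual_at_optimum = dual([1.0, 0.0], J, mu, Q)   # a test function closing the gap here
```

In this example `primal_value` and `dual_at_optimum` both equal 0.3 up to rounding, and no entry of `dual_samples` exceeds `primal_value`.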

Proof. Let us prove (a). As $L_n^k$ is the $\mathcal{X}$-marginal of $K_n^k$ and $\{K_n^k\}$ obeys the $(k,n)$-LDP with the good rate function $i$, the statement (a) follows from an obvious extension to the double index setting of the contraction principle (see [8]): $\{L_n^k\}$ obeys the $(k,n)$-LDP in $\mathcal{P}_{\mathcal{X}}$ with the good rate function

$$I(Q) = \inf\{i(q);\ q\in\mathcal{P}_{\mathcal{ZX}} : q_{\mathcal{X}} = Q\} \tag{5.5}$$

which is (5.2).

Let us prove (b). We rewrite the proof of Theorem 4.9 with $L_n^k$ instead of $K_n^k$. As in the proof of Proposition 3.14, we replace $F(z,x)$ by $f(x)$ to obtain the pointwise convergence of the normalized log-Laplace transforms

$$\lim_{k\to\infty}\Lambda_k(f) = \Lambda(f) \tag{5.6}$$

for all $f\in C_{\mathcal{X}}$, with

$$\Lambda_k(f) = \frac1k\int_{\mathcal{Z}}\log\langle e^{kf},P_z^k\rangle\,\mu(dz)\quad\text{and}\quad\Lambda(f) = \int_{\mathcal{Z}}\sup_{x\in\mathcal{X}}\{f(x) - J_z(x)\}\,\mu(dz).$$

Note that $\Lambda_k(f) = \Lambda_k(F_f)$ and $\Lambda(f) = \Lambda(F_f)$ with $F_f(z,x) = f(x)$, so that (5.6) is a specialization of (4.6).

Exactly the same arguments as in the proof of Theorem 4.9 allow us to establish that $\{L_n^k\}_{k,n\ge1}$ obeys the LDP in $C'_{\mathcal{X}}$ with the rate function $\Lambda^*(Q) = \sup_{f\in C_{\mathcal{X}}}\{\langle f,Q\rangle - \Lambda(f)\}$, $Q\in C'_{\mathcal{X}}$. In particular, (4.7) and (4.11) become

$$|\Lambda_k(f)|\le\|f\|,\quad|\Lambda(f)|\le\|f\|, \tag{5.7}$$

for all $f\in C_{\mathcal{X}}$ and

$$\Gamma\text{-}\lim_{k\to\infty}\Lambda_k^* = \Lambda^* \tag{5.8}$$

in $C'_{\mathcal{X}}$, where these convex conjugates are taken with respect to the duality $(C'_{\mathcal{X}},C_{\mathcal{X}})$.

Thanks to (I4.12p . (15. 5p and the uniqueness of the rate function (see |I1|), we see that 
$\{\Lambda^*<+\infty\}\subset\mathcal{P}_{\mathcal{X}}$. We conclude as in the proof of Theorem 4.9 that $\{L_n^k\}_{k,n\ge1}$ obeys the LDP in $\mathcal{P}_{\mathcal{X}}$ with the rate function $\Lambda^*(Q) = \sup_{f\in C_{\mathcal{X}}}\{\langle f,Q\rangle - \Lambda(f)\}$, $Q\in\mathcal{P}_{\mathcal{X}}$. As the rate function is unique,

$$I = \Lambda^*. \tag{5.9}$$

Considering $-f$ instead of $f$ in $\sup_{f\in C_{\mathcal{X}}}$ leads to statement (b).

Let us prove (c). As $i$ is a good rate function, the result follows from the identity (5.5).
Finally, statement (d) is a direct consequence of Lemma 5.13 below. □

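Let us introduce the $[0,\infty]$-valued functions $I$ and $I_k$ on $C'_{\mathcal{X}}$ which are defined for all $k\ge1$ and $Q\in C'_{\mathcal{X}}$ by

$$I_k(Q) = \inf\Big\{\frac1k\int_{\mathcal{Z}}H(q^z\mid P_z^k)\,\mu(dz);\ q\in\mathcal{P}_{\mathcal{ZX}} : q_{\mathcal{Z}}=\mu,\ q_{\mathcal{X}}=Q\Big\} \tag{5.10}$$

$$I(Q) = \inf\Big\{\int_{\mathcal{Z}\times\mathcal{X}}J_z(x)\,q(dzdx);\ q\in\mathcal{P}_{\mathcal{ZX}} : q_{\mathcal{Z}}=\mu,\ q_{\mathcal{X}}=Q\Big\} \tag{5.11}$$

where we use the same notation $I(Q)$ for the function on $C'_{\mathcal{X}}$ and its restriction to $\mathcal{P}_{\mathcal{X}}$ (see (5.5)) and the convention that $\inf\emptyset = +\infty$. In particular, the effective domains of $I_k$ and $I$ are included in $\mathcal{P}_{\mathcal{X}}$.

As a by-product of the proof of Theorem 5.1, we have the following corollary.

Corollary 5.12. [Hypotheses of Theorem 5.1]. The sequence $(I_k)_{k\ge1}$ $\Gamma$-converges to $I$ in $C'_{\mathcal{X}}$.

Proof. We have shown at (5.9) that $\Lambda^* = I$. It is also true that $\Lambda_k^* = I_k$, as can be shown by a minor modification of the proof of (3.17). One concludes with (5.8). □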
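
The approximation of $I$ by the normalized entropies $I_k$ can be watched on a two-point example. The following is a minimal sketch, assuming the hypothetical family $P_z^k(x)\propto e^{-kJ_z(x)}$ (chosen here only because it concentrates at rate $J_z$ when $\min_x J_z(x) = 0$) and the same toy data as a $2\times2$ transport problem; none of the names or numbers come from the paper. It only compares the values $I_k(Q)$ and $I(Q)$ at one fixed $Q$, which is weaker than $\Gamma$-convergence but shows the $1/k$ scale of the correction.

```python
import math

J = [[0.0, 1.0], [1.0, 0.0]]    # hypothetical rate functions J_z(x)
mu = [0.5, 0.5]                 # law of z
Q = [0.2, 0.8]                  # prescribed law of x

def coupling(t):
    """The 2x2 coupling of mu and Q whose entry q[0][0] equals t."""
    return [[t, mu[0] - t], [Q[0] - t, mu[1] - Q[0] + t]]

def transport_cost(t):
    q = coupling(t)
    return sum(J[z][x] * q[z][x] for z in range(2) for x in range(2))

def entropic_cost(t, k):
    """(1/k) * sum_z mu(z) H(q^z | P^k_z), with P^k_z(x) proportional to exp(-k J[z][x])."""
    q = coupling(t)
    total = 0.0
    for z in range(2):
        log_norm = math.log(sum(math.exp(-k * J[z][x]) for x in range(2)))
        for x in range(2):
            if q[z][x] > 0.0:
                log_p = -k * J[z][x] - log_norm          # log P^k_z(x)
                # mu(z) q^z(x) = q[z][x] and q^z(x) = q[z][x] / mu[z]
                total += q[z][x] * (math.log(q[z][x] / mu[z]) - log_p)
    return total / k

lo, hi = max(0.0, mu[0] + Q[0] - 1.0), min(mu[0], Q[0])
ts = [lo + (hi - lo) * s / 1000 for s in range(1001)]   # grid over all couplings

I_value = min(transport_cost(t) for t in ts)
I_k_values = {k: min(entropic_cost(t, k) for t in ts) for k in (1, 10, 100, 1000)}
```

For this family the entropic and transport costs of any fixed coupling differ by at most $(\log 2)/k$, so `I_k_values[k]` approaches `I_value` at that rate.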

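During the proof of Theorem 5.1, we have invoked the following

Lemma 5.13. [Hypotheses of Theorem 5.1]. If for each $k$ the Feller system $(P_z^k)_{z\in\mathcal{Z}}$ satisfies (5.3) with $\beta$ continuous, $I$ is given by

$$I(Q) = \begin{cases}\displaystyle\int_{\mathcal{X}}J_{\beta(x)}(x)\,Q(dx) & \text{if } Q\in\mathcal{P}_{\mathcal{X}} \text{ and } \beta\circ Q=\mu\\ +\infty & \text{otherwise}\end{cases},\qquad Q\in C'_{\mathcal{X}}. \tag{5.14}$$

Proof. Let us first show that $\mathrm{dom}\,I$ is included in $\mathcal{P}_\beta(\mu) := \{Q\in C'_{\mathcal{X}};\ Q\in\mathcal{P}_{\mathcal{X}},\ \beta\circ Q=\mu\}$, whenever $\beta$ is continuous.

As a direct consequence of Proposition 3.14-c, we obtain for all $Q\in C'_{\mathcal{X}}$ that

$$I_k(Q) = \begin{cases}\frac1k H(Q\mid P^k) & \text{if } Q\in\mathcal{P}_{\mathcal{X}} \text{ and } \beta\circ Q=\mu\\ +\infty & \text{otherwise.}\end{cases} \tag{5.15}$$

This holds with $\beta$ measurable, see Remark 3.19. Hence, $\mathrm{dom}\,I_k\subset\mathcal{P}_\beta(\mu)$ for each $k$. Corollary 5.12 implies that $\mathrm{dom}\,I$ is included in the closure of $\mathcal{P}_\beta(\mu)$ in $C'_{\mathcal{X}}$. As $\beta$ is assumed to be continuous, $\{Q\in C'_{\mathcal{X}};\ \langle Q,g\circ\beta\rangle = \langle\mu,g\rangle,\ \forall g\in C_{\mathcal{Z}}\}$ is closed in $C'_{\mathcal{X}}$ and one obtains the inclusion $\mathrm{dom}\,I\subset\{Q\in C'_{\mathcal{X}};\ \langle Q,g\circ\beta\rangle = \langle\mu,g\rangle,\ \forall g\in C_{\mathcal{Z}}\}$. On the other hand, $\mathrm{dom}\,I\subset\mathcal{P}_{\mathcal{X}}$. Therefore, we obtain the desired inclusion $\mathrm{dom}\,I\subset\{Q\in C'_{\mathcal{X}};\ \langle Q,g\circ\beta\rangle = \langle\mu,g\rangle,\ \forall g\in C_{\mathcal{Z}}\}\cap\mathcal{P}_{\mathcal{X}} = \mathcal{P}_\beta(\mu)$.

This implies that (5.11) admits the unique minimizer $q^*(dzdx) = \mu(dz)\,Q(dx\mid\beta(X)=z)$ and gives (5.14). □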

6. Applications to the optimal transport

We apply the main results of Sections 4 and 5 to the setting of Section 2. The space $\mathcal{X} = \mathbb{R}^{2d}$ is the space of the random couples and $\mathcal{Z} = \mathbb{R}^d$ is the space of the initial positions. The empirical random measures are specified by (2.2), (2.4) and (2.16). In the whole present section, the Assumptions 2.7 are supposed to hold.

The spaces $C_{\mathbb{R}^d}$ and $C_{\mathbb{R}^{2d}}$ of all continuous bounded functions on $\mathbb{R}^d$ and $\mathbb{R}^{2d}$ are equipped with their topologies of uniform convergence and their dual spaces $C'_{\mathbb{R}^d}$ and $C'_{\mathbb{R}^{2d}}$ are equipped with the corresponding $*$-weak topologies, see Section 1.7. It is convenient to use the notation

$$\zeta_A(y) = \begin{cases}0 & \text{if } y\in A\\ +\infty & \text{if } y\notin A\end{cases}$$

which is called the "convex" indicator of the subset $A$ ($\zeta_A$ is a convex function if and only if $A$ is a convex set). Under the Assumptions 2.7, the assumptions of Theorem 5.1 are satisfied with

$$J_z(x) = c(x_0,x_1) + \zeta_{\{x_0=z\}},\quad x = (x_0,x_1)\in\mathbb{R}^{2d},\ z\in\mathbb{R}^d$$

where $c$ is given at (2.10). Let $\pi^k$ be defined by (2.17). In the present setting, the functions $I_k$ and $I$ defined at (5.10) and (5.11) are given for all $p\in C'_{\mathbb{R}^{2d}}$ by $I_k = S_k$ and $I = S$ where

$$S_k(p) = \frac1k H(p\mid\pi^k) + \zeta_{\Pi_0(\mu)}(p)$$

$$S(p) = \int_{\mathbb{R}^{2d}}c\,dp + \zeta_{\Pi_0(\mu)}(p)$$

with $\int_{\mathbb{R}^{2d}}c\,dp = \int_{\mathbb{R}^{2d}}c(x_0,x_1)\,p(dx_0dx_1)$ and

$$\Pi_0(\mu) = \{p\in C'_{\mathbb{R}^{2d}};\ \langle p,\varphi\circ X_0\rangle = \langle\mu,\varphi\rangle,\ \forall\varphi\in C_{\mathbb{R}^d}\}$$

and the convention that if(p|7r'^) = +oo and J^2d cdp = +oo for all p E C^2d \ V^2d. Of 
course, no(p) fl V]^2d is the set of all probability measures on R^'* such that po = P- 
The reason for introducing $C'$ besides $\mathcal P$ is that the strong unit ball $U$ of $C'$ is *-weak compact, while compactness in $\mathcal P$ requires tightness criteria. This will considerably simplify the compactness arguments.

To see that the identity about $S$ holds true, observe that the canonical projection $X_0$ is continuous. In particular, we have (5.3) with the continuous function $\beta = X_0$, which by Lemma 5.13 gives (5.14). The identity about $S_k$ is (5.15) with $\beta = X_0$.
We shall also use the sets 

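$$\Pi_1(\nu) = \{p \in C'_{\mathbb R^{2d}};\ \langle p, \varphi\circ X_1\rangle = \langle \nu, \varphi\rangle,\ \forall \varphi \in C_{\mathbb R^d}\} \quad\text{and}$$

$$\Pi(\mu, \nu) = \Pi_0(\mu) \cap \Pi_1(\nu).$$

As $X_0$ and $X_1$ are continuous, $\Pi_0(\mu)$ and $\Pi_1(\nu)$ are well-defined subsets of $C'_{\mathbb R^{2d}}$ since $\varphi\circ X_0$ and $\varphi\circ X_1$ are in $C_{\mathbb R^{2d}}$. We use the same notation for $\Pi(\mu, \nu)$ in $\mathcal P_{\mathbb R^{2d}}$ and $C'_{\mathbb R^{2d}}$. We define for all $\nu \in \mathcal P_{\mathbb R^d}$ and all $k$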

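$$T_k(\nu) = \inf_{p \in \Pi(\mu,\nu)} \frac{1}{k} H(p|\pi^k)$$

$$T(\nu) = \inf_{p \in \Pi(\mu,\nu)} \int_{\mathbb R^{2d}} c\,dp$$

and we set $T_k(\nu) = T(\nu) = +\infty$ whenever $\nu \in C'_{\mathbb R^d} \setminus \mathcal P_{\mathbb R^d}$.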

Caution. We'll denote similarly the rate functions $S_k$, $S$, $T_k$ and $T$ on $C'$ and their restrictions to $\mathcal P$.

Lemma 6.1. For each $k$,

(a) $\{M_n^k\}_{n\ge1}$ obeys the $n$-LDP in $\mathcal P_{\mathbb R^{2d}}$ and $C'_{\mathbb R^{2d}}$ with the good rate function $kS_k$ and

(b) $\{N_n^k\}_{n\ge1}$ obeys the $n$-LDP in $\mathcal P_{\mathbb R^d}$ and $C'_{\mathbb R^d}$ with the good rate function $kT_k$.

Proof. To get (a), apply Proposition 3.14. (b) follows by the contraction principle. □



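Applying Theorem 5.1, one obtains

Theorem 6.2. The following assertions hold true.

(a) $\{N_n^k\}_{k,n\ge1}$ obeys the $(k,n)$-LDP in $\mathcal P_{\mathbb R^d}$ with the rate function $\nu \in \mathcal P_{\mathbb R^d} \mapsto T_c(\mu, \nu) \in [0, \infty]$.

(b) For all $\nu \in \mathcal P_{\mathbb R^d}$,

$$T_c(\mu, \nu) = \sup_{f \in C_{\mathbb R^d}} \Big\{ \int_{\mathbb R^d} S_1 f(x_0)\,\mu(dx_0) - \int_{\mathbb R^d} f(x_1)\,\nu(dx_1) \Big\}$$

with $S_1 f(z) = \inf_{x_1 \in \mathbb R^d}\{c(z, x_1) + f(x_1)\}$, $z \in \mathbb R^d$.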

Remark 6.3. The statement (b) of this theorem is the Kantorovich duality ([21], Theorem 1.3) and Theorem 5.1 (b) is a general version of this duality result.
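As a sanity check of the duality stated in Theorem 6.2 (b), the following sketch compares the two sides on a two-point space. Everything in it is an illustrative assumption rather than part of the paper: the support `xs`, the weights `mu` and `nu`, the quadratic cost, and the two grid searches that stand in for the infimum over transport plans and the supremum over $f$.

```python
# Toy check of the Kantorovich duality on a two-point space.
# All data below are illustrative assumptions, not taken from the paper.
xs = [0.0, 1.0]      # common support of mu and nu
mu = [0.5, 0.5]      # initial mass distribution
nu = [0.2, 0.8]      # target mass distribution


def cost(x0, x1):
    """Quadratic cost c(x0, x1) = |x0 - x1|^2."""
    return (x0 - x1) ** 2


def primal_value(steps=1000):
    """Grid search for inf { sum c dp ; p has marginals mu and nu }.

    A 2x2 coupling with these marginals is determined by t = p[0][0].
    """
    lo = max(0.0, mu[0] + nu[0] - 1.0)
    hi = min(mu[0], nu[0])
    best = float("inf")
    for s in range(steps + 1):
        t = lo + (hi - lo) * s / steps
        p = [[t, mu[0] - t], [nu[0] - t, 1.0 - mu[0] - nu[0] + t]]
        total = sum(p[i][j] * cost(xs[i], xs[j])
                    for i in range(2) for j in range(2))
        best = min(best, total)
    return best


def s1(f, z):
    """S_1 f(z) = min over x1 of c(z, x1) + f(x1)."""
    return min(cost(z, x1) + f[j] for j, x1 in enumerate(xs))


def dual_value(steps=400, bound=2.0):
    """Grid search for sup_f { sum S_1 f dmu - sum f dnu }.

    The functional is invariant under f -> f + constant, so f[0] = 0 is fixed
    and only f[1] is scanned over [-bound, bound].
    """
    best = float("-inf")
    for s in range(steps + 1):
        f = [0.0, -bound + 2.0 * bound * s / steps]
        value = (sum(mu[i] * s1(f, xs[i]) for i in range(2))
                 - sum(nu[j] * f[j] for j in range(2)))
        best = max(best, value)
    return best
```

In this toy case both searches return 0.3 up to rounding: the optimal plan moves a mass 0.3 from the point 0 to the point 1, and the supremum is reached at $f = (0, -1)$.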

Similarly, we have the 

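Proposition 6.4. The following assertions hold true.

(a) $\{M_n^k\}_{k,n\ge1}$ obeys the $(k,n)$-LDP in $\mathcal P_{\mathbb R^{2d}}$ with the rate function $S$.

(b) For all $p \in \mathcal P_{\mathbb R^{2d}}$ such that $p_0 = \mu$,

$$\int_{\mathbb R^{2d}} c\,dp = \sup_{g \in C_{\mathbb R^{2d}}} \Big\{ \int_{\mathbb R^d} S_{01} g(x_0)\,\mu(dx_0) - \int_{\mathbb R^{2d}} g(x_0, x_1)\,p(dx_0 dx_1) \Big\}$$

with $S_{01} g(z) = \inf_{x_1 \in \mathbb R^d}\{c(z, x_1) + g(z, x_1)\}$, $z \in \mathbb R^d$.

As a consequence of the preceding results, we have the

Theorem 6.5. The following assertions hold true.

(a) $\Gamma\text{-}\lim_{k\to\infty} S_k = S$ in $C'_{\mathbb R^{2d}}$ and $\mathcal P_{\mathbb R^{2d}}$.

(b) $\Gamma\text{-}\lim_{k\to\infty} T_k = T$ in $C'_{\mathbb R^d}$ and $\mathcal P_{\mathbb R^d}$.

(c) Since $\Gamma\text{-}\lim_{k\to\infty} T_k = T$, for all $\nu \in \mathcal P_{\mathbb R^d}$ there exists a sequence $(\nu_k)$ in $\mathcal P_{\mathbb R^d}$ such that $\lim_{k\to\infty} \nu_k = \nu$ in $\mathcal P_{\mathbb R^d}$ and $\lim_{k\to\infty} T_k(\nu_k) = T(\nu)$ in $[0, \infty]$.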

Proof. It is proved in [14] that in a Polish space $X$, if one has a $k$-indexed family of $n$-LDPs with rate functions $kI_k$ such that the doubly indexed sequence obeys the (weak) $(k, n)$-LDP with rate function $I$, then $\Gamma\text{-}\lim_{k\to\infty} I_k = I$ in $X$. By Lemma 6.1 and Theorem 6.2, it follows that the announced limits hold in the Polish spaces $\mathcal P_{\mathbb R^{2d}}$ and $\mathcal P_{\mathbb R^d}$. They also hold in $C'_{\mathbb R^d}$ and $C'_{\mathbb R^{2d}}$ since the effective domains of $S_k$ and $S$ and of $T_k$ and $T$ (considered as functions on $C'$) are included in $\mathcal P_{\mathbb R^{2d}}$ and $\mathcal P_{\mathbb R^d}$. This proves (a) and (b). Statement (c) follows from [15], Proposition 8.1. □
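The convergence of the normalized relative entropies toward the transport cost can be observed numerically on a finite state space. The sketch below is only an illustration under loud assumptions: the reference measure is taken of the form $\pi^k(x_0, x_1) = \mu(x_0)\,e^{-k c(x_0, x_1)}/Z_k(x_0)$ (the paper defines $\pi^k$ at (2.17), which is not reproduced in this section), the data `xs`, `mu`, `nu` are hypothetical, and the infimum over transport plans is replaced by a grid search.

```python
import math

# Illustrative data (assumptions, not from the paper).
xs = [0.0, 1.0]
mu = [0.5, 0.5]
nu = [0.2, 0.8]


def cost(x0, x1):
    """Quadratic cost; note that min over x1 of c(x0, x1) is 0 on this support."""
    return (x0 - x1) ** 2


def log_reference(k):
    """log pi^k(x0, x1) for the ASSUMED reference mu(x0) exp(-k c) / Z_k(x0)."""
    table = []
    for i, x0 in enumerate(xs):
        m = min(cost(x0, x1) for x1 in xs)
        # log-sum-exp form of log Z_k(x0), stable for large k
        log_z = -k * m + math.log(
            sum(math.exp(-k * (cost(x0, x1) - m)) for x1 in xs))
        table.append([math.log(mu[i]) - k * cost(x0, x1) - log_z for x1 in xs])
    return table


def couplings(steps=2000):
    """Yield the 2x2 couplings of (mu, nu) along a grid of t = p[0][0]."""
    lo = max(0.0, mu[0] + nu[0] - 1.0)
    hi = min(mu[0], nu[0])
    for s in range(steps + 1):
        t = lo + (hi - lo) * s / steps
        yield [[t, mu[0] - t], [nu[0] - t, 1.0 - mu[0] - nu[0] + t]]


def transport_cost():
    """T(nu): grid search for the minimal value of sum c dp."""
    return min(sum(p[i][j] * cost(xs[i], xs[j])
                   for i in range(2) for j in range(2))
               for p in couplings())


def entropic_cost(k):
    """T_k(nu): grid search for the minimal value of (1/k) H(p | pi^k)."""
    log_pi = log_reference(k)
    best = float("inf")
    for p in couplings():
        h = sum(p[i][j] * (math.log(p[i][j]) - log_pi[i][j])
                for i in range(2) for j in range(2) if p[i][j] > 0.0)
        best = min(best, h / k)
    return best
```

In this finite example the values `entropic_cost(k)` themselves approach `transport_cost()` as `k` grows, with a gap of order $1/k$ coming from the entropy term. Theorem 6.5 only asserts $\Gamma$-convergence, which is weaker than pointwise convergence in general.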

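Let $\{g_i;\ i \ge 1\}$ be a countable subset of $C_{\mathbb R^d}$ such that $d(\eta, \nu) = \sum_{i\ge1} 2^{-i}\big(|\langle g_i, \eta - \nu\rangle| \wedge 1\big)$, $\eta, \nu \in \mathcal P_{\mathbb R^d}$, is a metric which is compatible with the narrow convergence topology on $\mathcal P_{\mathbb R^d}$. For all $\nu \in \mathcal P_{\mathbb R^d}$ and all $p \in C'_{\mathbb R^{2d}}$, define

$$d(p_1, \nu) = \sum_{i\ge1} 2^{-i}\big(|\langle g_i\circ X_1, p\rangle - \langle g_i, \nu\rangle| \wedge 1\big).$$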
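A truncated version of such a metric is easy to evaluate for discrete measures. The sketch below is a hedged illustration: the family `g` (cosines and sines of increasing frequency) is a hypothetical choice, measures are represented as dictionaries `{point: weight}`, and the series is cut after `n_terms` terms, which leaves a tail smaller than $2^{-n_{\mathrm{terms}}}$. Whether this particular family makes $d$ compatible with the narrow topology is not checked here.

```python
import math


def g(i, x):
    """Hypothetical test functions g_i: cos(x), sin(x), cos(2x), sin(2x), ..."""
    freq = (i + 1) // 2
    return math.cos(freq * x) if i % 2 == 1 else math.sin(freq * x)


def pairing(i, measure):
    """<g_i, measure> for a discrete measure given as {point: weight}."""
    return sum(w * g(i, x) for x, w in measure.items())


def d(eta, nu, n_terms=40):
    """Truncation of sum over i >= 1 of 2^{-i} * min(|<g_i, eta - nu>|, 1)."""
    return sum(2.0 ** (-i) * min(abs(pairing(i, eta) - pairing(i, nu)), 1.0)
               for i in range(1, n_terms + 1))


# Three example measures (illustrative assumptions).
a = {0.0: 0.5, 1.0: 0.5}
b = {0.0: 0.2, 1.0: 0.8}
c = {0.5: 1.0}
```

Each summand is a truncated pseudo-distance, so `d` is symmetric, vanishes on the diagonal, satisfies the triangle inequality and is bounded by 1.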

Let us recall the three minimization problems 



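(MK$_k^\alpha$)  minimize $\frac{1}{k} H(p|\pi^k) + \alpha\,d(p_1, \nu)$ subject to $p \in \Pi_0(\mu)$.

(MK$^\alpha$)  minimize $\int_{\mathbb R^{2d}} c\,dp + \alpha\,d(p_1, \nu)$ subject to $p \in \Pi_0(\mu)$.

(MK)  minimize $\int_{\mathbb R^{2d}} c\,dp$ subject to $p \in \Pi(\mu, \nu)$.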

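Theorem 6.6. Assume that $T_c(\mu, \nu) < \infty$.

(a) We have: $\lim_{\alpha\to\infty} \lim_{k\to\infty} \inf_{p \in \Pi_0(\mu)} \big\{\frac{1}{k} H(p|\pi^k) + \alpha\,d(p_1, \nu)\big\} = T_c(\mu, \nu)$.

(b) For each $k$ and $\alpha$, (MK$_k^\alpha$) admits a unique solution $p_k^\alpha$ in $\mathcal P_{\mathbb R^{2d}}$. For each $\alpha$, $(p_k^\alpha)_{k\ge1}$ is a relatively compact sequence in $\mathcal P_{\mathbb R^{2d}}$ and any limit point of $(p_k^\alpha)_{k\ge1}$ is a solution of (MK$^\alpha$).

(c) For each $\alpha$, (MK$^\alpha$) admits at least a (possibly not unique) solution $p^\alpha$. The sequence $(p^\alpha)_{\alpha\ge1}$ is relatively compact in $\mathcal P_{\mathbb R^{2d}}$ and any limit point of $(p^\alpha)_{\alpha\ge1}$ is a solution of (MK).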



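Proof. We introduce functions on $C'_{\mathbb R^{2d}}$ corresponding to (MK$_k^\alpha$), (MK$^\alpha$) and (MK). They are defined for all $p \in C'_{\mathbb R^{2d}}$ and each $k, \alpha \ge 1$ by

$$G_k^\alpha(p) = S_k(p) + \alpha\,d(p_1, \nu) = \frac{1}{k} H(p|\pi^k) + \zeta_{\Pi_0(\mu)}(p) + \alpha\,d(p_1, \nu)$$

$$G^\alpha(p) = S(p) + \alpha\,d(p_1, \nu) = \int_{\mathbb R^{2d}} c\,dp + \zeta_{\Pi_0(\mu)}(p) + \alpha\,d(p_1, \nu)$$

$$G(p) = \int_{\mathbb R^{2d}} c\,dp + \zeta_{\Pi(\mu,\nu)}(p).$$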

The domains of $S_k$ and $S$ are included in the strong unit ball $U_{\mathbb R^{2d}}$ of $C'_{\mathbb R^{2d}}$. Therefore, the domains of $G_k^\alpha$, $G^\alpha$ and $G$ are also in $U_{\mathbb R^{2d}}$ which is $\sigma(C'_{\mathbb R^{2d}}, C_{\mathbb R^{2d}})$-compact.

We know that $S_k$, $S$ are lower semicontinuous, $d(p_1, \nu)$ is continuous and bounded below and $\Pi_1(\nu)$ is closed. Therefore, $G_k^\alpha$, $G^\alpha$ and $G$ are inf-compact.

As the relative entropy is strictly convex, $G_k^\alpha$ is also strictly convex: it admits a unique minimizer $p_k^\alpha$.

As a function of $p$, $d(p_1, \nu)$ is a finite continuous function on $C'_{\mathbb R^{2d}}$. Together with the convergence $\Gamma\text{-}\lim_{k\to\infty} S_k = S$, this implies (see [15], Proposition 6.21) that for all $\alpha$,

$$\Gamma\text{-}\lim_{k\to\infty} G_k^\alpha = G^\alpha \quad\text{in } C'_{\mathbb R^{2d}}.$$

Observe that $\lim_{\alpha\to\infty} \alpha\,d(p_1, \nu) = \zeta_{\Pi_1(\nu)}(p)$ for all $p \in \mathcal P_{\mathbb R^{2d}}$. As this limit is increasing, by [15], Proposition 5.4 we have

$$\Gamma\text{-}\lim_{\alpha\to\infty} G^\alpha = G \quad\text{in } C'_{\mathbb R^{2d}}.$$

Together with the relative compactness of the domains, these $\Gamma$-convergence results entail the whole theorem (see [15], Theorem 7.8 and Corollary 7.20). □
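The role of the penalization $\alpha\,d(p_1, \nu)$ in (MK$^\alpha$) can be seen on a two-point example: only the first marginal is constrained, and the second marginal is pushed toward $\nu$ as $\alpha$ grows. The sketch below rests on hypothetical choices throughout (the data `xs`, `mu`, `nu`, the quadratic cost, the test functions `g`, the truncation `N_TERMS`, and a grid search in place of the exact minimization).

```python
import math

# Illustrative two-point data (assumptions, not from the paper).
xs = [0.0, 1.0]
mu = [0.5, 0.5]
nu = [0.2, 0.8]
N_TERMS = 30


def cost(x0, x1):
    return (x0 - x1) ** 2


def g(i, x):
    """Hypothetical test functions cos(x), sin(x), cos(2x), sin(2x), ..."""
    freq = (i + 1) // 2
    return math.cos(freq * x) if i % 2 == 1 else math.sin(freq * x)


# Values g_i(x) on the support, computed once.
G_VALUES = [[g(i, x) for x in xs] for i in range(1, N_TERMS + 1)]


def d(p1, target):
    """Truncated sum over i of 2^{-i} * min(|<g_i, p1> - <g_i, target>|, 1)."""
    total = 0.0
    for i, row in enumerate(G_VALUES, start=1):
        gap = sum((p1[j] - target[j]) * row[j] for j in range(2))
        total += 2.0 ** (-i) * min(abs(gap), 1.0)
    return total


def penalized_value(alpha, steps=50):
    """Grid search for inf { sum c dp + alpha * d(p_1, nu) ; p_0 = mu }.

    A plan with first marginal mu is written p[i][j] = mu[i] * q_i(j);
    the grid runs over a = q_0(1) and b = q_1(1).
    """
    best = float("inf")
    for sa in range(steps + 1):
        for sb in range(steps + 1):
            a, b = sa / steps, sb / steps
            p = [[mu[0] * (1 - a), mu[0] * a], [mu[1] * (1 - b), mu[1] * b]]
            p1 = [p[0][0] + p[1][0], p[0][1] + p[1][1]]
            transport = sum(p[i][j] * cost(xs[i], xs[j])
                            for i in range(2) for j in range(2))
            best = min(best, transport + alpha * d(p1, nu))
    return best
```

For small `alpha` the minimizer barely moves any mass and the value is far below the optimal transport cost 0.3 of this example; for large `alpha` the second marginal is forced onto `nu` and the value reaches that cost.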

7. Γ-CONVERGENCE OF CONVEX FUNCTIONS ON A WEAKLY COMPACT SPACE

This section is dedicated to the proof of Corollary 7.4 which is an important tool for the proof of Theorem 4.9.

A typical result about the $\Gamma$-convergence of a sequence of convex functions $(f_n)$ is: If the sequence of the convex conjugates $(f_n^*)$ converges in some sense, then $(f_n)$ $\Gamma$-converges. Known results of this type are usually stated in separable reflexive Banach spaces. For instance Corollary 3.13 of H. Attouch's monograph [1] is

Theorem 7.1. Let $X$ be a separable reflexive Banach space and $(f_n)$ a sequence of closed convex functions from $X$ into $(-\infty, +\infty]$ satisfying the equicoerciveness assumption: $f_n(x) \ge \alpha(\|x\|)$ for all $x \in X$ and $n \ge 1$ with $\lim_{r\to+\infty} \alpha(r)/r = +\infty$. Then, the following statements are equivalent

(1) $f = \mathrm{seq}\,X_w\text{-}\Gamma\text{-}\lim_{n\to\infty} f_n$

(2) $f^* = X_s^*\text{-}\Gamma\text{-}\lim_{n\to\infty} f_n^*$

(3) $\forall y \in X^*$, $f^*(y) = \lim_{n\to\infty} f_n^*(y)$

where $X^*$ is the dual space of $X$, $\mathrm{seq}\,X_w$ refers to the weak sequential convergence in $X$ and $X_s^*$ to the strong convergence in $X^*$.

Escaping from the reflexivity assumption is quite difficult, as can be seen in G. Beer's 
monograph [2]. 

In some applications in probability, the reflexive Banach space setting is not as natural as it is for the usual applications of variational convergence to PDEs. For instance when dealing with random measures on $\mathcal X$, the narrow topology $\sigma(\mathcal P_{\mathcal X}, C_b(\mathcal X))$ doesn't fit the above framework since $C_b(\mathcal X)$ endowed with the uniform topology may not be separable (unless $\mathcal X$ is compact) and is not reflexive.

The next result is an analogue of Theorem 7.1 which agrees with applications for random probability measures. Since we didn't find it in the literature, we give its detailed proof.

Let $X$ and $Y$ be two vector spaces in separating duality. The space $X$ is furnished with the weak topology $\sigma(X, Y)$.

We denote by $\zeta_C$ the indicator function of the subset $C$ of $X$, which is defined by $\zeta_C(x) = 0$ if $x$ belongs to $C$ and $\zeta_C(x) = +\infty$ otherwise. Its convex conjugate is the support function of $C$: $\zeta_C^*(y) = \sup_{x \in C}\langle x, y\rangle$, $y \in Y$.

Theorem 7.2. Let $(g_n)$ be a sequence of functions on $Y$ such that

(a) for all $n$, $g_n$ is a real-valued convex function on $Y$,

(b) $(g_n)$ converges pointwise to $g := \lim_{n\to\infty} g_n$,

(c) $g$ is real-valued and

(d) in restriction to any finite dimensional vector subspace $Z$ of $Y$, $(g_n)$ $\Gamma$-converges to $g$, i.e. $\Gamma\text{-}\lim_{n\to\infty}(g_n + \zeta_Z) = g + \zeta_Z$, where $\zeta_Z$ is the indicator function of $Z$.

Denote the convex conjugates on $X$: $f_n = g_n^*$ and $f = g^*$.
If in addition,

(e) there exists a compact set $K \subset X$ such that $\mathrm{dom}\,f_n \subset K$ for all $n \ge 1$ and $\mathrm{dom}\,f \subset K$

then, $(f_n)$ $\Gamma$-converges to $f$ with respect to $\sigma(X, Y)$.

Remark 7.3. By ([15], Proposition 5.12), under the assumption (a), assumption (d) is implied by:

(d') in restriction to any finite dimensional vector subspace $Z$ of $Y$, $(g_n)$ is equibounded, i.e. for all $y_o \in Z$, there exists $\delta > 0$ such that

$$\sup_{n\ge1} \sup\{|g_n(y)|;\ y \in Z,\ |y - y_o| < \delta\} < \infty.$$



A useful consequence of Theorem 7.2 is

Corollary 7.4. Let $(Y, \|\cdot\|)$ be a normed space and $X$ its topological dual space. Let $(g_n)$ be a sequence of functions on $Y$ such that

(a) for all $n$, $g_n$ is a real-valued convex function on $Y$,

(b) $(g_n)$ converges pointwise to $g := \lim_{n\to\infty} g_n$ and

(d'') there exists $c > 0$ such that $|g_n(y)| \le c(1 + \|y\|)$ for all $y \in Y$ and $n \ge 1$.

Then, $(f_n)$ $\Gamma$-converges to $f$ with respect to $\sigma(X, Y)$ where $f_n = g_n^*$ and $f = g^*$.

Proof. Under (b), (d'') implies (c). Since the functions $g_n$ are convex, (d'') implies that $\{g_n;\ n \ge 1\}$ is locally equi-Lipschitz. Therefore (d'') implies (d') and we have (d) by Remark 7.3. Finally, (d'') implies (e) with $K = \{x \in X;\ \|x\|_* \le c\}$ where $\|x\|_* = \sup_{y,\,\|y\|\le1}\langle x, y\rangle$ is the dual norm on $X$. Indeed, suppose that for all $y \in Y$, $g(y) \le c + c\|y\|$ and take $x \in X$ such that $g^*(x) < +\infty$. As for all $y$, $\langle x, y\rangle \le g(y) + g^*(x)$, we get $|\langle x, y\rangle|/\|y\| \le (g^*(x) + c)/\|y\| + c$. Letting $\|y\|$ tend to infinity gives $\|x\|_* \le c$ which is the announced result.

The conclusion follows from Theorem 7.2. □
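A one-dimensional example may help to see what Corollary 7.4 says. Take $Y = X = \mathbb R$ and $g_n(y) = \sqrt{y^2 + 1/n}$, which is convex, converges pointwise to $g(y) = |y|$ and satisfies (d'') with $c = 1$. Its conjugate is $f_n(x) = -\sqrt{(1 - x^2)/n}$ for $|x| \le 1$ and $+\infty$ otherwise, so that the domains all sit in the compact set $K = [-1, 1]$ and $f_n$ tends to the indicator of $[-1, 1]$, which is $g^*$. The sketch below, an illustration chosen here and not taken from the paper, approximates the conjugate by a discrete Legendre transform on a bounded grid; `radius` and `steps` are arbitrary numerical parameters.

```python
import math


def g(n, y):
    """g_n(y) = sqrt(y^2 + 1/n): convex, |g_n(y)| <= 1 + |y|, g_n -> |y|."""
    return math.sqrt(y * y + 1.0 / n)


def conjugate(n, x, radius=50.0, steps=20000):
    """Approximate f_n(x) = sup_y { x*y - g_n(y) } over a grid on [-radius, radius]."""
    best = float("-inf")
    for s in range(steps + 1):
        y = -radius + 2.0 * radius * s / steps
        best = max(best, x * y - g(n, y))
    return best


def exact_conjugate(n, x):
    """Closed form of the conjugate, valid for |x| <= 1."""
    return -math.sqrt((1.0 - x * x) / n)
```

For $|x| > 1$ the grid supremum grows linearly with `radius`, which is the numerical trace of $f_n(x) = +\infty$ outside $K$.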

The proof of Theorem 7.2 is postponed after the two preliminary Lemmas 7.5 and 7.11.

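Lemma 7.5. Let $f : X \to (-\infty, +\infty]$ be a lower semicontinuous convex function such that $\mathrm{dom}\,f$ is included in a compact set. Let $V$ be a closed convex subset of $X$.
Then, if $V$ satisfies

(7.6)  $$V \cap \mathrm{dom}\,f \ne \emptyset \quad\text{or}\quad V \cap \mathrm{cl\,dom}\,f = \emptyset,$$

we have

(7.7)  $$\inf_{x \in V} f(x) = -\inf_{y \in Y}\big(f^*(y) + \zeta_V^*(-y)\big) \in (-\infty, +\infty]$$

(7.8)  $$\inf_{x \in W} f(x) = -\inf_{y \in Y}\big(f^*(y) + \zeta_W^*(-y)\big) = +\infty$$

Proof. The proof is divided in two parts. We first consider the case where $V \cap \mathrm{dom}\,f \ne \emptyset$, then the case where $V \cap \mathrm{cl\,dom}\,f = \emptyset$.

• The case where $V \cap \mathrm{dom}\,f \ne \emptyset$. As $V$ is a nonempty closed convex set, its indicator function is a closed convex function so that its biconjugate satisfies $\zeta_V^{**} = \zeta_V$, i.e. $\zeta_V(x) = \sup_{y \in Y}\{\langle x, y\rangle - \zeta_V^*(y)\}$ for all $x \in X$. Consequently,

$$\inf_{x \in V} f(x) = \inf_{x \in X} \sup_{y \in Y}\{f(x) + \langle x, y\rangle - \zeta_V^*(y)\}.$$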

One wishes to invert $\inf_{x \in X}$ and $\sup_{y \in Y}$ by means of the following standard inf-sup theorem (see p] for instance). We have $\inf_{x \in X} \sup_{y \in Y} F(x, y) = \sup_{y \in Y} \inf_{x \in X} F(x, y)$ provided that $\inf_{x \in X} \sup_{y \in Y} F(x, y) \ne \pm\infty$ and

- $\mathrm{dom}\,F$ is a product of convex sets,

- $x \mapsto F(x, y)$ is convex and lower semicontinuous for all $y$,

- there exists $y_o$ such that $x \mapsto F(x, y_o)$ is inf-compact and

- $y \mapsto F(x, y)$ is concave for all $x$.

Our assumptions on $f$ allow us to apply this result with $F(x, y) = f(x) + \langle x, y\rangle - \zeta_V^*(y)$. Note that

(7.9)  $$\inf_{x \in X} f(x) > -\infty$$

since $f$ doesn't take the value $-\infty$ and is assumed to be lower semicontinuous on a compact set. Therefore, if $\inf_{x \in V} f(x) < +\infty$, we have

$$\inf_{x \in V} f(x) = \sup_{y \in Y} \inf_{x \in X}\{f(x) + \langle x, y\rangle - \zeta_V^*(y)\} = -\inf_{y \in Y}\{f^*(y) + \zeta_V^*(-y)\}.$$
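The last equality in this display only uses the definition of the convex conjugate, $f^*(y) = \sup_{x \in X}\{\langle x, y\rangle - f(x)\}$, followed by the change of variable $y \mapsto -y$. Spelled out, as a routine verification in which $\zeta_V^*$ stands for the support function of $V$:

```latex
\inf_{x\in X}\{f(x)+\langle x,y\rangle\}
  = -\sup_{x\in X}\{\langle x,-y\rangle - f(x)\}
  = -f^*(-y),
\qquad\text{hence}
\\[4pt]
\sup_{y\in Y}\inf_{x\in X}\{f(x)+\langle x,y\rangle-\zeta_V^*(y)\}
  = \sup_{y\in Y}\{-f^*(-y)-\zeta_V^*(y)\}
  = -\inf_{y\in Y}\{f^*(-y)+\zeta_V^*(y)\}
  = -\inf_{y\in Y}\{f^*(y)+\zeta_V^*(-y)\}.
```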

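• The case where $V \cap \mathrm{cl\,dom}\,f = \emptyset$. As $\mathrm{cl\,dom}\,f$ is assumed to be compact, by Hahn-Banach theorem $\mathrm{cl\,dom}\,f$ and $V$ are strictly separated: there exists $y_o \in Y$ such that $\zeta_V^*(y_o) = \sup_{x \in V}\langle x, y_o\rangle < \inf_{x \in \mathrm{cl\,dom}\,f}\langle x, y_o\rangle \le \inf_{x \in \mathrm{dom}\,f}\langle x, y_o\rangle$. Hence,

(7.10)  $$\inf_{x \in \mathrm{dom}\,f}\{\langle x, y_o\rangle - \zeta_V^*(y_o)\} > 0$$

and

$$\begin{aligned}
-\inf_{y \in Y}\big(f^*(y) + \zeta_V^*(-y)\big) &= \sup_{y \in Y} \inf_{x \in X}\{f(x) + \langle x, y\rangle - \zeta_V^*(y)\} \\
&= \sup_{y \in Y} \inf_{x \in \mathrm{dom}\,f}\{f(x) + \langle x, y\rangle - \zeta_V^*(y)\} \\
&\ge \inf_{x \in X} f(x) + \sup_{a > 0} \inf_{x \in \mathrm{dom}\,f}\{\langle x, a y_o\rangle - \zeta_V^*(a y_o)\} \\
&= \inf_{x \in X} f(x) + \sup_{a > 0} a \inf_{x \in \mathrm{dom}\,f}\{\langle x, y_o\rangle - \zeta_V^*(y_o)\} \\
&= +\infty
\end{aligned}$$

where the last equality follows from (7.9) and (7.10). This proves that (7.8) holds with $W = V$.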

• Finally, if (7.6) isn't satisfied, taking $W$ such that $W \subset \mathrm{int}\,V$ ensures the strict separation of $W$ and $\mathrm{cl\,dom}\,f$ as above. □

Lemma 7.11. Let the $\sigma(X, Y)$-closed convex neighbourhood $V$ of the origin be defined by

$$V = \{x \in X;\ \langle y_i, x\rangle \le 1,\ 1 \le i \le k\}$$

with $k \ge 1$ and $y_1, \dots, y_k \in Y$. Its support function $\zeta_V^*$ is $[0, \infty]$-valued, inf-compact and its domain is the finite dimensional convex cone spanned by $\{y_1, \dots, y_k\}$. More precisely, its level sets are $\{\zeta_V^* \le b\} = b\,\mathrm{cv}\{y_1, \dots, y_k\}$ for each $b \ge 0$ where $\mathrm{cv}\{y_1, \dots, y_k\}$ is the convex hull of $\{y_1, \dots, y_k\}$.

Proof. The closed convex set $V$ is the polar set of $N = \{y_1, \dots, y_k\}$: $V = N^\circ$. Let $x_1 \in V$ and $x_o \in E := \bigcap_{1\le i\le k} \ker y_i$. Then, $\langle y_i, x_1 + x_o\rangle = \langle y_i, x_1\rangle \le 1$. Hence, $x_1 + x_o \in V$. Considering the factor space $X/E$, we now work within a finite dimensional vector space whose algebraic dual space is spanned by $\{y_1, \dots, y_k\}$.

We still denote by $X$ and $Y$ these finite dimensional spaces. We are allowed to apply the finite dimension results which are proved in the book [22] by Rockafellar and Wets. In particular, one knows that if $C$ is a closed convex set in $Y$, then the gauge function $\gamma_C(y) := \inf\{\lambda \ge 0;\ y \in \lambda C\}$, $y \in Y$ is the support function of its polar set $C^\circ = \{x \in X;\ \langle x, y\rangle \le 1,\ \forall y \in C\}$. This means that $\gamma_C = \zeta_{C^\circ}^*$ (see [22], Example 11.19).

As $V = (N^{\circ\circ})^\circ$ and $N^{\circ\circ}$ is the closed convex hull of $N$, i.e. $N^{\circ\circ} = \mathrm{cv}(N)$: the convex hull of $N$, we get $V = \mathrm{cv}(N)^\circ$ and

$$\zeta_V^* = \gamma_{\mathrm{cv}(N)}.$$

In particular, for all real $b$, $\zeta_V^*(y) \le b \Leftrightarrow \gamma_{\mathrm{cv}(N)}(y) \le b \Leftrightarrow y \in b\,\mathrm{cv}(N)$. It follows that the effective domain of $\zeta_V^*$ is the convex cone spanned by $y_1, \dots, y_k$ and $\zeta_V^*$ is inf-compact. □
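A concrete instance of Lemma 7.11 may be useful. The choice below ($X = Y = \mathbb R^2$ with the four vectors $\pm e_1$, $\pm e_2$) is an illustration picked here, not an example from the paper: the neighbourhood $V$ is the unit square, its support function is the $\ell^1$ norm, and the level sets of that norm are the dilations of the convex hull of $N$.

```latex
% Worked instance of Lemma 7.11 in X = Y = \mathbb{R}^2 (illustrative choice of N).
N = \{(1,0),\,(-1,0),\,(0,1),\,(0,-1)\}, \qquad
V = N^\circ = \{x\in\mathbb{R}^2;\ |x_1|\le 1,\ |x_2|\le 1\},
\\[4pt]
\zeta_V^*(y) = \sup_{x\in V}\langle x,y\rangle = |y_1|+|y_2|,
\qquad
\mathrm{cv}(N) = \{y\in\mathbb{R}^2;\ |y_1|+|y_2|\le 1\},
\\[4pt]
\{\zeta_V^*\le b\} = \{y\in\mathbb{R}^2;\ |y_1|+|y_2|\le b\} = b\,\mathrm{cv}(N),
\qquad b\ge 0 .
```

Here the cone spanned by $N$ is the whole plane, so $\zeta_V^*$ is finite everywhere, and its level sets are compact, which is the inf-compactness claimed in the lemma.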

Proof of Theorem 7.2. Let $\mathcal N(x_o)$ denote the set of all the neighbourhoods of $x_o \in X$. We want to prove that $\Gamma\text{-}\lim_{n\to\infty} f_n(x_o) := \sup_{U \in \mathcal N(x_o)} \lim_{n\to\infty} \inf_{x \in U} f_n(x) = f(x_o)$. Since $f$ is lower semicontinuous, we have $f(x_o) = \sup_{U \in \mathcal N(x_o)} \inf_{x \in U} f(x)$, so that it is enough to show that for all $U \in \mathcal N(x_o)$, there exists $V \in \mathcal N(x_o)$ such that $V \subset U$ and

(7.12)  $$\lim_{n\to\infty} \inf_{x \in V} f_n(x) = \inf_{x \in V} f(x).$$

The topology $\sigma(X, Y)$ is such that $\mathcal N(x_o)$ admits the sets

$$V = \{x \in X;\ |\langle y_i, x - x_o\rangle| \le 1,\ i \le k\}$$

as a base where $(y_1, \dots, y_k)$, $k \ge 1$ describes the collection of all the finite families of vectors in $Y$. By Lemma 7.5, there exists such a $V \subset U$ which satisfies

$$\inf_{x \in V} f_n(x) = -\inf_{y \in Y} h_n(y) \text{ for all } n \ge 1 \quad\text{and}\quad \inf_{x \in V} f(x) = -\inf_{y \in Y} h(y)$$

where we denote $h_n(y) = g_n(y) + \zeta_V^*(-y)$ and $h(y) = g(y) + \zeta_V^*(-y)$, $y \in Y$.

Let $Z$ denote the vector space spanned by $(y_1, \dots, y_k)$ and $h_n^Z$, $h^Z$ the restrictions to $Z$ of $h_n$ and $h$. For all $y \in Y$, we have

(7.13)  $$\zeta_V^*(-y) = -\langle x_o, y\rangle + \zeta_{V - x_o}^*(-y)$$

and by Lemma 7.11, the effective domain of $\zeta_V^*$ is $Z$. Therefore, to prove (7.12) it remains to show that

(7.14)  $$\lim_{n\to\infty} \inf_{y \in Z} h_n^Z(y) = \inf_{y \in Z} h^Z(y).$$

By assumptions (b) and (d), $(h_n^Z)$ $\Gamma$-converges and pointwise converges to $h^Z$. Note that this $\Gamma$-convergence is a consequence of the lower semicontinuity of the convex conjugate $\zeta_V^*$ and Proposition 6.25 of [15].

Because of assumptions (a) and (c), $(h_n^Z)$ is also a sequence of finite convex functions which converges pointwise to the finite function $h^Z$. By ([2T], Theorem 10.8), $(h_n^Z)$ converges to $h^Z$ uniformly on any compact subset of $Z$ and $h^Z$ is convex.

We now consider three cases for $x_o$.
The case where $x_o \in \mathrm{dom}\,f$. We already know that $(h_n^Z)$ $\Gamma$-converges to $h^Z$. To prove (7.14), it remains to check that the sequence $(h_n^Z)$ is equicoercive (see [15], Theorem 7.8). For all $y \in Y$, $g(y) - \langle x_o, y\rangle \ge -f(x_o)$ and (7.13) imply $h(y) \ge -f(x_o) + \zeta_{V - x_o}^*(-y)$. Since $-f(x_o) > -\infty$ and $\zeta_{V - x_o}^*$ is inf-compact (Lemma 7.11), we obtain that $h^Z$ is inf-compact. As $(h_n^Z)$ converges to $h^Z$ uniformly on any compact subset of $Z$, it follows that $(h_n^Z)$ is equicoercive. This proves (7.14).

The case where $x_o \in \mathrm{cl\,dom}\,f$. In this case, there exists $x_o' \in \mathrm{dom}\,f$ such that $V' = x_o' + (V - x_o)/2 = \{x \in X;\ |\langle 2y_i, x - x_o'\rangle| \le 1,\ i \le k\} \in \mathcal N(x_o')$ satisfies $x_o \in V' \subset V \subset U$. One deduces from the previous case that (7.14) holds true with $V'$ instead of $V$.
The case where $x_o \notin \mathrm{cl\,dom}\,f$. As $(h_n^Z)$ $\Gamma$-converges to $h^Z$, by ([2], Proposition 1.3.5) we have $\limsup_{n\to\infty} \inf_{y \in Z} h_n^Z(y) \le \inf_{y \in Z} h^Z(y)$. As $x_o \notin \mathrm{cl\,dom}\,f$, for any small enough $V \in \mathcal N(x_o)$, $\inf_{y \in Z} h^Z(y) = -\inf_{x \in V} f(x) = -\infty$. Therefore, $\lim_{n\to\infty} \inf_{y \in Z} h_n^Z(y) = \inf_{y \in Z} h^Z(y) = -\infty$ which is (7.14).

This completes the proof of Theorem 7.2. □